Tydemux


Road map / Todo sketch


This document aims to describe what the tydemux project should become in order to take it to the next step in terms of usability and functionality. It also provides a sketch of what needs to be implemented to reach those goals, both in terms of outlined algorithms and possible open source code bases that could be utilized. We should try to use as much open source code from other projects as possible in order to minimize development time. However, the code we take should be of good quality and properly documented, otherwise we will just end up with a lot of unmaintainable code.



Preliminary goals for the next two upcoming versions of the tydemux project.



0.5.0:


0.6.0:



Technical description of how to reach the goals stated above.


Description of 0.5.0 goals:



Internal muxing:

The internal muxing will be based on mplex from mjpeg-tools; it's simply the most feature-complete and most accurate open source MPEG system stream multiplexer currently available.


We already have "crude" ports of mplex to Windows based on the latest 1.6.0 release of mjpeg-tools. What needs to be done here is to get the latest CVS version of mplex (present in the mjpeg-play module on the mjpeg-tools CVS server at SourceForge) and port it to Windows (and Linux). This includes creation of VC6 project files and Unix makefiles, since we can't use the mjpeg-tools build system.


We must also alter mplex so that we can call it from tydemux, hence we need to make it into a lib. The functions needed are:
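
As a very rough illustration of what turning mplex into a lib could look like, here is a hypothetical sketch of a library-style interface. None of these names exist in mplex today; the parameters are assumptions based on what tydemux already knows about the streams.

#include <stdint.h>
#include <stddef.h>

/* Hypothetical sketch only - none of these names exist in mplex today.
 * The idea is to replace mplex's file-driven main() with callable
 * functions so tydemux can drive the muxing in-process. */

typedef struct mplex_params_s {
    int mux_format;        /* e.g. DVD-compliant program stream */
    int video_bitrate;     /* set by tydemux, not probed by mplex */
    int audio_bitrate;
    int audio_frame_size;  /* bytes per audio frame */
    int audio_frame_time;  /* playback time per audio frame */
} mplex_params_t;

/* Set up the muxer with parameters tydemux already knows. */
int mplex_lib_init(const mplex_params_t *params, const char *output_file);

/* Feed demuxed/transcoded ES data from memory instead of from files. */
int mplex_lib_write_video(const uint8_t *data, size_t len);
int mplex_lib_write_audio(const uint8_t *data, size_t len);

/* Flush and close the multiplexed output. */
int mplex_lib_close(void);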



Creation of VTS and VIDEO_TS IFO/BUP files:

This can be done in two alternative ways: either we extend mplex to create the necessary information in the private stream 2 PCI/DSI packets, or we have a separate function that scans the muxed stream created by mplex. The latter is probably the simplest approach; it's what e.g. the dvd-author program does. We also need this scanning function later on to create/alter button definitions and commands in the private stream 2.
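
To make the "separate scanning function" idea concrete, here is a minimal sketch of a scanner that walks a muxed program stream buffer and reports every private stream 2 packet (stream id 0xbf, the packets that carry PCI/DSI). This is purely an illustration, not dvd-author's code; the callback and function names are made up.

#include <stdint.h>
#include <stddef.h>

/* Sketch: report every private stream 2 packet in a program stream
 * buffer.  A real implementation would walk the stream pack by pack
 * rather than doing a byte-wise search, but the idea is the same. */
void scan_private_stream_2(const uint8_t *buf, size_t len,
                           void (*found)(size_t offset, size_t size))
{
    size_t i = 0;

    while (i + 6 <= len) {
        /* PES start code 0x00 0x00 0x01 followed by stream id 0xbf */
        if (buf[i] == 0x00 && buf[i + 1] == 0x00 &&
            buf[i + 2] == 0x01 && buf[i + 3] == 0xbf) {
            /* PES packet length is the next two bytes */
            size_t size = ((size_t)buf[i + 4] << 8) | buf[i + 5];
            found(i, size);
            i += 6 + size;
        } else {
            i++;
        }
    }
}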


The information present in the PCI/DSI packets is also necessary for the creation of the VTS IFO/BUP files, hence my suggestion is that we go with a separate routine, at least for the time being. Hence what is needed to achieve this goal is:


Creation of cell structure within the DVD VOB:

This is just an extension of the aforementioned IFO/BUP file creation, and it's a relatively simple operation compared to that one. However, I don't suggest that we take the dvd-author code and simply integrate it. The code is not documented at all and it's very crude (and to some extent buggy). Instead I suggest that we (Olaf) implement the bare minimum of functionality in libifoedit. Libifoedit is a better way to go in the long run and it will scale to the needs we have for 0.6.0. Hence both the aforementioned IFO/BUP creation as well as the cell structure creation will be made with the help of libifoedit. Hence what is needed to achieve this goal is:



Internal audio transcoding:

Rowan is the main architect of this part of the project. He has made good progress and we have a fully functional, high quality MPEG audio transcoder library and front end called tytranscode. What remains is to support AC3/a52 transcoding.


(My own personal notes: If I fully understand the lib then it's very easy to use. Basically, feed the audio frames to AudioInputFrame instead of writing to disk, then collect transcoded frames with AudioGetOutPutFrame until the function returns NULL. Continue doing this until the stream is finished, at which point you can, if you want, use AudioGetLastFrame to fetch the last sound clip, which might only be half full.


Basically, if we have made all sync corrections before calling InputFrame / OutPutFrame we will be home free, since we will never add any "time" to the audio stream (unless we use GetLast, which we should not do).)
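
To make the note above a bit more concrete, here is a minimal sketch of that loop. Only the function names (AudioInputFrame, AudioGetOutPutFrame) come from the notes; the exact signatures and the write_audio_es() hand-off are assumptions.

#include <stdint.h>

/* Assumed signatures - only the function names come from the notes
 * above, everything else in this sketch is an assumption. */
extern int      AudioInputFrame(uint8_t *frame, int size);
extern uint8_t *AudioGetOutPutFrame(int *out_size);
extern void     write_audio_es(uint8_t *data, int size); /* hypothetical hand-off to the muxer */

void transcode_one_audio_frame(uint8_t *frame, int size)
{
    uint8_t *out;
    int out_size;

    /* Feed the demuxed audio frame to the transcoder instead of
     * writing it to disk ... */
    AudioInputFrame(frame, size);

    /* ... then collect every transcoded frame that is ready;
     * AudioGetOutPutFrame() returns NULL when nothing more is
     * available for the moment. */
    while ((out = AudioGetOutPutFrame(&out_size)) != NULL)
        write_audio_es(out, out_size);
}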


Since we have full control of the audio format with help of the AudioInitMpegAudioConverter func, we will have no problem feeding those audio variables to mplex. (What about frame size and play time for each frame? If I'm not totally incorrect, mplex needs those in order to mux – remember, we will not probe the stream with mplex but set the values ourselves.)

(My own personal notes: In regards to silent frames and the case of adding frames to compensate for sync drift in the audio stream: I most probably don't want to create silent audio frames of the format that we transcode to, but I think a good idea is to init two instances of the transcoder, one "one to one format" and one "one to another format" (if we don't transcode we will just init a "one to one format").


Anyhow, when we detect sync errors in the audio stream – i.e. missing frames – then we will just call the "one to one format" instance and request the X number of silent frames needed to mend the error. The advantage of doing it with the "one to one format" instance is that we don't need to worry about facts such as one frame mapping to, say, 0.75 frames in the new format.)
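
A rough sketch of how the "one to one format" instance could be used to mend such a gap. Every name here is hypothetical; the point is only that one silent input frame maps to exactly one output frame.

#include <stdint.h>

/* Hypothetical names - the "one to one format" instance hands back
 * silent frames in the same format as the stream we are mending. */
extern uint8_t *AudioGetSilentFrame(int *out_size);
extern void     write_audio_es(uint8_t *data, int size);

void mend_audio_gap(int nr_missing_frames)
{
    int i, size;
    uint8_t *silent;

    for (i = 0; i < nr_missing_frames; i++) {
        /* One frame in equals one frame out, so we never have to
         * worry about one frame mapping to, say, 0.75 frames in a
         * different output format. */
        silent = AudioGetSilentFrame(&size);
        write_audio_es(silent, size);
    }
}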



Extend the way of reading tystreams – i.e. socket, stdin, fifo support:

Socket and stdin are input-only ways of reading tystreams, while a fifo is both input and output (read/write). Well, we can write to stdout if we mux, but it's kind of hard to write to stdout if we just demux – two streams, you know.


The issue with not reading from a file is that we 1: don't know the size of the stream, 2: don't know what the stream contains in terms of audio and video parameters. Now, we could sample some of it when we read from socket/fifo/stdin, but we must provide the users with a method of setting at least one crucial fact, namely whether we have AC3 or MPEG sound (the rest can be sampled from the input – although it could be nice to provide switches for it too). Hence what is needed to achieve this goal is:



In regards to socket support, I would like to say that we should write it in such a way that we are open to future extensions, such as the tyserver running on the Tivo (0.6.0).


Open source code for handling all of this can be found in ffmpeg, mplayer and the VideoLAN project (see links).
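
As a sketch of the core difference from plain file reads, here is a chunk reader that works on any file descriptor (regular file, socket, fifo or stdin). The CHUNK_SIZE value and the function itself are assumptions for illustration; tydemux's real chunk constant should be used.

#include <unistd.h>
#include <errno.h>
#include <stdint.h>

#define CHUNK_SIZE 0x20000   /* tystream chunk size - assumption, use tydemux's real constant */

/* Read one full chunk from any fd.  Unlike a read from a regular
 * file, a socket/fifo/stdin read can return short counts, so we
 * must loop until the chunk is complete.
 * Returns 1 on a full chunk, 0 on clean EOF, -1 on error. */
int read_full_chunk(int fd, uint8_t *buf)
{
    size_t got = 0;
    ssize_t n;

    while (got < CHUNK_SIZE) {
        n = read(fd, buf + got, CHUNK_SIZE - got);
        if (n == 0)
            return got == 0 ? 0 : -1;   /* EOF; mid-chunk EOF is an error */
        if (n < 0) {
            if (errno == EINTR)
                continue;               /* interrupted, just retry */
            return -1;
        }
        got += (size_t)n;
    }
    return 1;
}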


Code cleanup, threading and profiling:

Large parts of the code cleanup are done (it remains to port the 0.4.x fixes up to the CVS version); we have grouped similar functionality into sets of files beginning with the same name. The tydemux.h header file holds all functions present in tydemux and they are properly documented. There should be no problem generating high quality documentation from it with doxygen (spell checking still needs to be done throughout tydemux – anyone? – if so, please register at SourceForge and email us on the list). However, there is still work that needs to be done, mainly code duplication, unnecessary functions and so on. We could also be more consistent with the naming scheme of the functions.


Tydemux is today relatively easy to thread; we have some "main" loops that are basically operating independently.

As said, the functions above operate independently – they basically do their work until a parameter halts them. They then wait until the next round in the big (real main) loop allows them to test again whether they can do their specific task.


As it is today, however, the control mechanism deciding when they should stop isn't there – hence if we fail in the check_fix_tmp_ref function, tydemux will happily continue with all the tasks before that function, and we will e.g. keep caching chunks as if memory were free (well, it's cheap, but we still have limitations :) ).


The control mechanisms for the read_chunk, check_chunk, add_chunk, check_junk_chunks, parse_chunk, tystream_init and repair_tystream functions are also in very bad shape and date back to version 0.1 or something like that. They need to have control functions (deciding whether they should do their work or not) implemented in a similar way to e.g. the check_fix_field function.


Discussion point :) – John suggested that we use pipes to shuffle data between each thread. I want to add that reading/writing in designated memory areas, i.e. shuffling pointers between the threads, is much better (I'm the amateur here, so spank me :)). My reason is that mplex can very easily be modified to read from memory instead of files – and it's very easy to feed mplex the pointer. Hence we will only need a minimum of changes to allow mplex to run as a separate thread dealing with the muxing of our two or more ES streams. Isn't the point of threading that we share the same address space? If we use pipes we can more or less fork off sub-processes instead, which is not as graceful as threads.
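
To illustrate the pointer-shuffling idea, here is a sketch of a small shared queue of chunk pointers protected by a mutex and a condition variable: the producing thread pushes a pointer, the consuming thread (e.g. the mplex thread) pops it, and only the pointer ever moves between threads. None of this exists in tydemux today; the names are for illustration only.

#include <pthread.h>
#include <stdlib.h>

typedef struct chunk_node_s {
    void *chunk;                    /* pointer into already-allocated memory */
    struct chunk_node_s *next;
} chunk_node_t;

typedef struct {
    chunk_node_t   *head, *tail;
    pthread_mutex_t lock;           /* init with PTHREAD_MUTEX_INITIALIZER */
    pthread_cond_t  ready;          /* init with PTHREAD_COND_INITIALIZER  */
} chunk_queue_t;

/* Producer side: hand a chunk pointer to the other thread. */
void queue_push(chunk_queue_t *q, void *chunk)
{
    chunk_node_t *n = malloc(sizeof(*n));
    n->chunk = chunk;
    n->next  = NULL;

    pthread_mutex_lock(&q->lock);
    if (q->tail)
        q->tail->next = n;
    else
        q->head = n;
    q->tail = n;
    pthread_cond_signal(&q->ready);     /* wake the consuming thread */
    pthread_mutex_unlock(&q->lock);
}

/* Consumer side: block until a chunk pointer is available. */
void *queue_pop(chunk_queue_t *q)
{
    chunk_node_t *n;
    void *chunk;

    pthread_mutex_lock(&q->lock);
    while (q->head == NULL)
        pthread_cond_wait(&q->ready, &q->lock);
    n = q->head;
    q->head = n->next;
    if (q->head == NULL)
        q->tail = NULL;
    pthread_mutex_unlock(&q->lock);

    chunk = n->chunk;
    free(n);
    return chunk;
}

Since both threads share the same address space, nothing is copied; with pipes the chunk data itself would have to be written and read back through the kernel.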


Anyhow, what needs to be done is:


Profiling: since we want to thread, I suggest that we do a profiling run of tydemux, tytranscode etc. This will enable us to do the right type of optimization of the code. If you have a Linux workstation I suggest using Valgrind/KCachegrind, which are very nice tools for this – please see "appendix".



GUI front end acting both as an editor and as a front end to the actual demuxing/remuxing operation:


(rewrite of initial mail from Olaf Beck)


How do we implement a good GUI to edit cut points in a tystream? If you take a look at the TyStream document you will see that each ty_packet_header has a seq_record_nr. If it isn't set to 0x7fff then we have a video sequence (SEQ) present in the chunk, at the record that the number in seq_record_nr indicates.


We know that a ty_record_type 0x7e0 (SEQ) will be followed by a ty_record_type 0xce0, which is a GOP header. We also know that following that will be a ty_record_type 0x8e0, which is an MPEG I-Frame. There will also be a packetized elementary stream (PES) header either before the SEQ (Tivo S2) or between the GOP and the I-Frame (Tivo S1). The PES header will give us the exact start/stop time between each GOP.


The aforementioned is the key to fast and efficient indexing – why use separate indexing when the TyStream already provides us with a perfect source from which to create an index? Anyway, if we now have a struct like this (there will be a few more entries in it, but it's a sketch)


typedef struct Gop_index_t gop_index_t;

struct Gop_index_t {
    /* Chunk number of the SEQ header */
    int chunk_number_seq;
    /* Record number of the SEQ header */
    int seq_rec_nr;
    /* Chunk number of the I-Frame */
    int chunk_number_i_frame;
    /* Record number of the I-Frame */
    int i_frame_rec_nr;
    /* Time of the I-Frame (from the PES header) */
    uint64_t time_of_iframe;
    /* Access pointers */
    gop_index_t *next;
    gop_index_t *previous;
};



and a function like this


gop_index_t * scan_tystream(tystream_holder_t * tystream);


What we get is a linked list holding the index of a tystream which is what we want.


Now let us step through this:


The user opens the file in the editor – what we first do is probe the stream and find where the audio starts. We also find out which seq/gop we would start the video/audio stream with if we weren't about to cut.


Given that information we run scan_tystream to create the index (this is a very fast operation since we only need to read the two-byte entry in the header and, optionally, say around 200 bytes of record info. I can't give you any real times, but I did some tests and it's very fast, i.e. more or less real time :). It should be fast, since it's what the Tivo uses to scan the stream when doing fast forward/rewind.


Now we read the first entry in the index list. From it we get the chunk number. We read in three chunks, starting from the chunk of the seq_header, with a revamped read_chunk func (one that seeks – i.e. takes a chunk number – instead of, as today, needing to be at the right location already).


Then we use the get_video func to fetch the I-Frame that is in rec_nr Y and chunk X. We feed the payload_t's payload_data to an MPEG-2 video decoder and display it to the user in the GUI. This can be done in real time, so the user will not notice a big delay.


Anyway – the good thing is that we know the chunk numbers of the cut, we know the times of the cut, and we know the exact seq number of the cut. Feed this info to the cutpoint function and we will be home free to make extremely precise cuts in the stream. For the video, cuts will use the SEQ info, while the audio stream will use the times. The problem with using times for the video is that we need to take care of dangling B-Frames: B-Frames that are displayed before the I-Frame but decoded after the I-Frame is decoded. Display order is like this: B B I B B P, while decoding order is like this: I B B P B B.
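
As an illustration of how the index can drive the cut-point selection, here is a small helper that walks the gop_index_t list sketched above and returns the last GOP starting at or before the requested cut time. The struct is the one from the sketch; the helper itself is hypothetical.

/* Find the index entry whose I-Frame is closest to, but not after,
 * the requested cut time.  Assumes the list is in stream order, as
 * produced by scan_tystream. */
gop_index_t *find_gop_for_cut(gop_index_t *index, uint64_t cut_time)
{
    gop_index_t *best = NULL;

    while (index) {
        if (index->time_of_iframe > cut_time)
            break;              /* everything after this is past the cut */
        best = index;
        index = index->next;
    }
    return best;                /* NULL if the cut time is before the first GOP */
}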


As for the GUI itself, I suggest Qt: it's cross-platform and a very solid toolkit (e.g. the whole of KDE, see www.kde.org, is built on it). I have some example code for a Qt tiff viewer; it uses the same principle that we will use to display frames, although it uses the tiff library and not the mpeg2 video library. We naturally have to have code to scroll backwards and forwards, insert cut points and so forth.


Anyhow, what needs to be done (besides writing the GUI) is:



Support for the tytar format:

Initially I suggest that we support this format in unpacked form (i.e. you have untarred the file). There are functions to read tar files directly, but I think we should limit ourselves a bit here – basically, we have enough to do already. However, we will need to solve the problem that the input file is in several parts. This is, however, very similar to VOB files on a DVD.


The good thing is that libdvdread already has the functions we need to access the different "tar" parts as one big file. Hence supporting the tar file format should be easy (unless we want to opt to read the tar file directly).
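
As a sketch of the "several parts as one big file" idea, similar in spirit to how libdvdread hides split VOBs (but not libdvdread's actual API; all names and the struct here are made up):

#include <stdio.h>
#include <stdint.h>

/* Sketch only: present N untarred tytar parts as one logical stream. */
typedef struct {
    FILE    **part;        /* open handles, one per part file        */
    uint64_t *part_size;   /* size of each part (needed for seeking) */
    int       nr_parts;
    int       cur;         /* part we are currently reading from     */
} tytar_file_t;

/* Read across part boundaries so callers never notice the split. */
size_t tytar_read(tytar_file_t *f, uint8_t *buf, size_t len)
{
    size_t total = 0;

    while (total < len && f->cur < f->nr_parts) {
        size_t n = fread(buf + total, 1, len - total, f->part[f->cur]);
        total += n;
        if (n == 0)
            f->cur++;       /* end of this part, continue with the next */
    }
    return total;
}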


Anyhow, what needs to be done is:

Fix of all outstanding bugs:

There are a lot of outstanding bugs and issues in the code base today. A lot of them will be taken care of by the revamping of the code to provide functionality demanded by other parts of the project, hence I have excluded those from the list below.




We also need our own private ftp/web site where we can store tystreams with different kinds of errors, as well as perfect streams of all the combinations out there.

As always, the document is up for discussion – ahh, one last thing: we need to have a common code format, otherwise it 1: will not look good, 2: will be trouble to read each other's code. Take a look at the Hacking_Roules_Tips file present in the CVS top-level tydemux dir – as said, it's up for discussion, but we will need to think something out.




Links, books, documents etc:


MPEG:


AC3/a52:


DVD:


GUI Toolkit (QT):


Socket handling of tystreams:



Development tools:

Valgrind advertisement – this is the best thing that has happened to Linux programming since sliced bread. It has similar functionality to Purify, but with the extra advantage that you don't need to compile the sources with it.


Just run "valgrind your_prog" and it will report every error your program makes, along with all the memory leaks you have. Way cool, I (Olaf) must say.


Even better is that you can use a program called KCachegrind; it is an extension to valgrind, and it lets you visualize the profiling data when you run a program with valgrind, hence you don't need to compile in profiling support – way cool, I (Olaf) must say :).


The link to valgrind is http://developer.kde.org/~sewardj/ and the link to kcachegrind is http://kcachegrind.sourceforge.net/; both sites hold extensive info about how to use the programs.


Both programs are a must in any Linux programmer's tool chest – they're even useful for Windows programmers if the code is cross-platform, as in tydemux's case.