Roadmap / Todo sketch
This document describes what should become of the tydemux project in
order to take it to the next step in usability and functionality. It
also sketches what needs to be implemented to reach those goals, both
in terms of outline algorithms and of open source code bases that
could be used. We should use as much open source code from other
projects as possible in order to minimize development time. However,
the code we take should be of good quality and properly documented,
otherwise we will just end up with a lot of unmaintainable code.
Preliminary goals for the next two upcoming versions of the tydemux
project.
0.5.0:
Internal muxing of audio and video, supporting the STD, SVCD and DVD MPEG standards
Creation of VTS and VIDEO_TS IFO/BUP files in case we are muxing to the DVD MPEG standard
Creation of cell structure within the DVD VOB, hence supporting the next and previous buttons on a DVD player but not chapters
Internal audio transcoding
Extend the way of reading tystreams, i.e. socket, stdin, fifo support
Code cleanup, threading and profiling
GUI front end acting both as an editor and as a front end to the actual demuxing/remuxing operation
Support of the tytar format
Fix of all outstanding bugs ;)
0.6.0:
Direct generation of DVD ISO images
Generation of DVD chapters and of menus accessing those chapters (i.e. DVD menu creation)
Generation of DVD subtitles rendered from the CC (or TT, if possible) data present in the tystream
Generation of multi-title DVDs, i.e. several recordings on the same DVD accessible from a top level menu
Possible GUI front end to the above functionality
Tydemux server on the Tivo for direct transport and management of tystreams
Technical description of how to reach the above stated goals.
Description of 0.5.0 goals:
Internal muxing:
The internal muxing will be based on mplex from mjpeg-tools. It is
simply the most feature complete and most accurate open source MPEG
system stream multiplexer currently available.
We already have “crude” ports of mplex to Windows based on the latest
1.6.0 release of mjpeg-tools. What needs to be done here is to get the
latest CVS version of mplex (present in the mjpeg-play module on the
mjpeg-tools CVS server at SourceForge) and port it to Windows (and
Linux). This includes creation of VC6 project files and Unix makefiles
since we can't use the mjpeg-tools build system.
We must also alter mplex so that we can call it from tydemux, hence we
need to make it into a lib. The functions needed are:
A way to tell mplex to halt the mux since we need to demux more data
A way for mplex to read the demuxed data originating from tydemux. I suggest either feeding mplex with a memory buffer controlled by tydemux or letting it read from FIFOs.
A way to skip the init of mplex, hence the probe functionality of mplex must be bypassed. Tydemux will instead populate the data structures that mplex needs in order to mux the streams.
A way of telling mplex what type of mux we want
A way of telling mplex what file to write to
Creation of VTS and VIDEO_TS IFO/BUP files:
This can be done in two alternative ways: either we extend mplex to
create the necessary information in the private stream 2 PCI/DSI
packets, or we have a separate function that scans the muxed stream
created by mplex. The latter is probably the simplest approach; it's
done by e.g. the dvd-author program. We also need this scanning
function later on to create/alter button definitions and commands in
private stream 2.
The information present in the PCI/DSI packets is also necessary for
the creation of the VTS IFO/BUP file, hence my suggestion is that we
go with a separate routine at least for the time being. What is needed
to achieve this goal is:
A way to scan the resulting muxed stream while it's under creation
A way to populate the PCI/DSI packets of the muxed stream while it's under creation
A way to store the information created during the two steps above and, from this information, create the VTS IFO and BUP files
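The scanning step in the list above could start out as something like
the sketch below, which walks a buffer of muxed data and counts the
private stream 2 packets it finds. It assumes the DVD layout as I
understand it: each such packet starts with the start code 00 00 01 BF
followed by a 16 bit big-endian length, and the first payload byte is a
substream id of 0x00 for PCI and 0x01 for DSI. The names nav_counts_t
and scan_nav_packets are made up for this example and are not taken
from mplex, dvd-author or libifoedit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: how many PCI and DSI packets were seen. */
typedef struct {
    size_t pci;
    size_t dsi;
} nav_counts_t;

/* Scan buf for private stream 2 packets (start code 00 00 01 BF) and
 * classify each one by its substream id byte (assumed 0x00 = PCI,
 * 0x01 = DSI). Packets cut off by the end of the buffer are skipped. */
nav_counts_t scan_nav_packets(const uint8_t *buf, size_t len)
{
    nav_counts_t counts = {0, 0};
    size_t i = 0;

    assert(buf != NULL || len == 0);
    while (i + 7 <= len) {
        if (buf[i] == 0x00 && buf[i + 1] == 0x00 &&
            buf[i + 2] == 0x01 && buf[i + 3] == 0xBF) {
            size_t pkt_len = ((size_t)buf[i + 4] << 8) | buf[i + 5];

            if (pkt_len == 0 || pkt_len > len - (i + 6)) {
                i += 4;             /* empty or truncated, move on */
                continue;
            }
            if (buf[i + 6] == 0x00)
                counts.pci++;
            else if (buf[i + 6] == 0x01)
                counts.dsi++;
            i += 6 + pkt_len;       /* jump past the whole packet */
        } else {
            i++;
        }
    }
    return counts;
}
```

A real scanner would also have to remember the file offset of each
packet so that the PCI/DSI fields can be filled in afterwards.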
Creation of cell structure within the DVD VOB:
This is just an extension of the aforementioned IFO/BUP file creation
and a relatively simple operation compared to it. However, I don't
suggest that we take the dvd-author code and simply integrate it. The
code is not documented at all and it's very crude (and to some extent
buggy). Instead I suggest that we (Olaf) implement the bare minimum of
functionality in libifoedit. Libifoedit is a better way to go in the
long run and it will scale to the needs we have for 0.6.0. Hence both
the aforementioned IFO/BUP creation and the cell structure creation
will be made with the help of libifoedit. What is needed to achieve
this goal is:
Internal audio transcoding:
Rowan is the main architect of this part of the project. He has made
good progress and we have a fully functional, high quality MPEG audio
transcoder library and front end called tytranscode. What remains is
to support AC3/a52 transcoding.
(My own personal notes: If I fully understand the lib then it's very
easy to use. Basically feed the audio frames to AudioInputFrame
instead of writing them to disk, then collect transcoded frames with
AudioGetOutPutFrame until the function returns NULL. Continue doing
this until the stream is finished, at which point you can, if you
want, use AudioGetLastFrame to fetch the last sound clip, which might
be only half full.
Basically, if we have made all sync corrections before calling
InputFrame / OutPutFrame we will be home free, since we will never add
any “time” to the audio stream (unless we use GetLast, which we should
not do).)
Since we have full control of the audio format with the help of the
AudioInitMpegAudioConverter func, we will have no problem feeding
those audio variables to mplex. (What about frame size and play time
for each frame? If I'm not totally incorrect mplex needs those in
order to mux. Remember that we will not probe the stream with mplex
but set the values ourselves.)
(My own personal notes: In regards to silent frames and the case of
adding frames to compensate for sync drift in the audio stream. I most
probably don't want to create silent audio frames in the format that
we transcode to, but I think a good idea is to init two instances of
the transcoder: one “one to one format” and one “one to another
format” (if we don't transcode we will just init a “one to one
format”).
Anyhow, when we detect sync errors in the audio stream, i.e. missing
frames, we will just call the “one to one format” instance and request
the X number of silent frames needed to mend the error. The advantage
of doing it with the “one to one format” instance is that we don't
need to worry about facts such as one frame mapping to, say, 0.75
frames in the new format.)
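To make the “X number of silent frames” concrete: with timestamps on
the 90 kHz MPEG clock, one MPEG Layer II frame of 1152 samples at
48 kHz lasts 2160 ticks. The helpers below are only a sketch with
made-up names (frame_ticks, silent_frames_needed). Rounding the gap to
the nearest whole frame is my assumption about how the mending should
behave, not something tydemux does today.

```c
#include <assert.h>
#include <stdint.h>

#define MPEG_CLOCK_HZ 90000 /* PTS ticks per second */

/* Duration of one audio frame in 90 kHz ticks (truncated if inexact). */
int64_t frame_ticks(int samples_per_frame, int sample_rate)
{
    assert(samples_per_frame > 0 && sample_rate > 0);
    return (int64_t)samples_per_frame * MPEG_CLOCK_HZ / sample_rate;
}

/* Number of silent frames needed to fill a gap of gap_ticks, rounded
 * to the nearest whole frame. A gap of zero or less needs no frames. */
int64_t silent_frames_needed(int64_t gap_ticks, int64_t ticks_per_frame)
{
    assert(ticks_per_frame > 0);
    if (gap_ticks <= 0)
        return 0;
    return (gap_ticks + ticks_per_frame / 2) / ticks_per_frame;
}
```

Counting in frames of the source format is what makes the “one to one
format” instance convenient here: the count never has to be converted
through a ratio like 0.75.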
Extend the way of reading tystreams, i.e. socket, stdin, fifo support:
Socket and stdin are input-only ways of reading tystreams, while a
fifo can be used for both input and output. We can write to stdout if
we mux, but it's kind of hard to write to stdout if we just demux (two
streams, you know).
The issue with not reading from a file is that 1: we don't know the
size of the stream, and 2: we don't know what the stream contains in
the form of audio and video parameters. Now, we could sample some of
it when we read from socket/fifo/stdin. But we must provide the users
with a method of setting at least one crucial fact, namely whether we
have AC3 or MPEG sound (the rest can be sampled from the input,
although it could be nice to provide switches for it). What is needed
to achieve this goal is:
A revamped read chunk function that is able to read from all supported sources
Revamped probe (collector) functions so that they work regardless of what type of input we use
A revamped skip to requested audio format function
Revamped write functionality so that we can write to at least fifo and file
In regards to socket support, I would like to say that we should write
it in such a way that we are open to future extensions such as the
tyserver running on the Tivo (0.6.0).
Open source code for handling all of this can be found in ffmpeg,
mplayer and the VideoLAN project (see links).
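One detail that a revamped read chunk function has to cope with is
that sockets, pipes and fifos may hand back fewer bytes than requested
in a single call. A minimal sketch using POSIX read() follows. The name
read_full is hypothetical, and the 128 KiB chunk size is my assumption
about the tystream layout rather than something stated in this
document. A Windows build would need its own equivalent.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

#define TY_CHUNK_SIZE (128 * 1024) /* assumed tystream chunk size */

/* Read exactly len bytes from fd unless the stream ends or an error
 * occurs. Returns the number of bytes read (less than len means end
 * of stream was reached), or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t got = 0;

    assert(buf != NULL);
    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);

        if (n == 0)
            break;              /* end of stream */
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

Calling read_full(fd, chunk, TY_CHUNK_SIZE) and getting back less than
TY_CHUNK_SIZE would then mean that the input ended in the middle of a
chunk, whatever kind of descriptor fd is.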
Code cleanup, threading and profiling:
Large parts of the code cleanup are done (what remains is to port the
0.4.x fixes up to the CVS version). We have grouped similar
functionality into sets of files beginning with the same name. The
tydemux.h header file holds all functions present in tydemux and they
are properly documented. There should be no problem generating high
quality documentation from it with doxygen (spell checking still
needs to be done throughout tydemux. Anyone? If so, please register at
SourceForge and email us at the list). However, there is still work
that needs to be done, mainly removing code duplication, unnecessary
functions and so on. We could also be more consistent with the naming
scheme of the functions.
Tydemux is today relatively easy to thread. We have some “main” loops
that basically operate independently:
read_chunk, check_chunk, add_chunk, check_junk_chunks
parse_chunk
tystream_init
repair_tystream
    check_fix_video_pes_holder
    fix_seq_header
    check_fix_p_frame
    check_fix_tmp_ref
    check_fix_fields
    check_fix_av_drift
check_audio_sync (done in the write phase at the moment)
transcode_audio (not there yet)
    Any sub functions here that can be threaded ??
remux (not there yet)
scan_ifo (not there yet)
write output (not there yet, well, not in the way it should be)
As said the functions above operate
independently – they are basically doing their work until a
parameter halts them. They then wait until the next round in the big
(real main) loop allows them to test again if they can do their
specific task.
As it is today, however, the mechanism to control when they should
stop isn't there. Hence, if we fail in the check_fix_tmp_ref function,
tydemux will happily continue with all the tasks before that function,
and we will cache e.g. chunks as if memory was free (well, it's cheap
but we still have limitations :) ).
The control mechanism for the read_chunk, check_chunk, add_chunk,
check_junk_chunks, parse_chunk, tystream_init and repair_tystream
functions is also in very bad shape and dates back to version 0.1 or
something like that. They need to have control functions (deciding
whether they should do their work or not) implemented in a similar way
to e.g. the check_fix_field function.
Discussion point :) - John suggested that we use pipes to shuffle data
between the threads. I want to add that reading/writing in designated
memory areas, i.e. shuffling pointers between the threads, is much
better (I'm the amateur here so spank me :)). My reason is that mplex
can very easily be modified to read from memory instead of files, and
it's very easy to feed mplex the pointer. Hence we will only need a
minimum of changes to allow mplex to run as a separate thread dealing
with the muxing of our two or more ES streams. Isn't the point of
threading that we share the same address space? If we use pipes then
we could more or less fork a subprocess instead, which is not as
graceful as threads.
Anyhow what is needed to be done is:
A way of dealing with threads internally that is transparent regardless of whether we use Windows or Unix threading (POSIX threads), i.e. in the same way as we today deal with e.g. file reading/writing
Control functions so we don't have runaway threads
Control functions similar to check_fix_field so each function/thread halts until it can start again
A way of shuffling data between the functions/threads
Determine how many threads we should have. We don't want to overkill here and use more than we really need.
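Two of the points above, shuffling pointers between threads and
keeping a fast stage from running away, can be covered by one small
building block: a bounded queue. The sketch below uses POSIX threads
only, so the Windows side of the abstraction is not shown, and every
name in it (ptr_queue_t, queue_push and so on) is invented for the
example. A producer blocks when the queue is full, which is what stops
read_chunk style stages from caching without limit.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAPACITY 8 /* arbitrary limit for the sketch */

/* Hypothetical bounded queue handing pointers from one stage to the next. */
typedef struct {
    void *items[QUEUE_CAPACITY];
    size_t head, count;
    int closed;                 /* set when no more data will arrive */
    pthread_mutex_t lock;
    pthread_cond_t not_full, not_empty;
} ptr_queue_t;

void queue_init(ptr_queue_t *q)
{
    assert(q != NULL);
    q->head = q->count = 0;
    q->closed = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_full, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Blocks while the queue is full. Returns 0 on success, -1 if closed. */
int queue_push(ptr_queue_t *q, void *item)
{
    int rc = -1;

    pthread_mutex_lock(&q->lock);
    while (q->count == QUEUE_CAPACITY && !q->closed)
        pthread_cond_wait(&q->not_full, &q->lock);
    if (!q->closed) {
        q->items[(q->head + q->count) % QUEUE_CAPACITY] = item;
        q->count++;
        rc = 0;
        pthread_cond_signal(&q->not_empty);
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Blocks while the queue is empty. Returns NULL once closed and drained. */
void *queue_pop(ptr_queue_t *q)
{
    void *item = NULL;

    pthread_mutex_lock(&q->lock);
    while (q->count == 0 && !q->closed)
        pthread_cond_wait(&q->not_empty, &q->lock);
    if (q->count > 0) {
        item = q->items[q->head];
        q->head = (q->head + 1) % QUEUE_CAPACITY;
        q->count--;
        pthread_cond_signal(&q->not_full);
    }
    pthread_mutex_unlock(&q->lock);
    return item;
}

/* Wakes every waiter; later pushes fail and pops drain what is left. */
void queue_close(ptr_queue_t *q)
{
    pthread_mutex_lock(&q->lock);
    q->closed = 1;
    pthread_cond_broadcast(&q->not_full);
    pthread_cond_broadcast(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}
```

Since queue_pop uses NULL to signal “closed and drained”, NULL itself
should not be pushed as data. One such queue between each pair of
stages would give every stage a natural place to halt and resume.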
Profiling: since we want to thread, I suggest that we profile tydemux,
tytranscode etc. This will enable us to do the right type of
optimization of the code. If you have a Linux workstation I suggest
using Valgrind/KCachegrind, which is a very nice tool to do this with.
Please see the “appendix”.
GUI front end acting both as an editor and as a front end to the actual
demuxing/remuxing operation:
(rewrite of initial mail from Olaf Beck)
How do we implement a good GUI to edit cut points in a tystream? If
you take a look at the TyStream document you will see that each
ty_packet_header has a seq_record_nr. If it isn't set to 0x7fff then
we have a video_sequence (SEQ) present in the chunk, at the record
that the number in seq_record_nr indicates.
We know that after a ty_record_type 0x7e0 (SEQ) will follow a
ty_record_type 0xce0, which is a GOP header. We also know that
following that will be a ty_record_type 0x8e0, which is an MPEG
I-Frame. There will also be a packetized elementary stream (PES)
header either before the SEQ (Tivo S2) or between the GOP and the
I-Frame (Tivo S1). The PES header will give us the exact start/stop
time between each GOP.
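As a small illustration of how those record types chain together, the
sketch below takes the record types of one chunk plus the
seq_record_nr from its header and locates the GOP header and I-Frame
that follow the SEQ. Only the four hex values come from the text
above. The macro names, the function name locate_gop_start and the
idea of passing the record types as a plain array are assumptions made
for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values quoted in the text; the macro names are made up here. */
#define TY_NO_SEQ       0x7fff
#define TY_REC_SEQ      0x7e0
#define TY_REC_GOP      0xce0
#define TY_REC_I_FRAME  0x8e0

/* Locate the GOP header and I-Frame following the SEQ record. Other
 * records (such as the PES header on a Tivo S1) may sit in between.
 * Returns 0 and fills *gop and *i_frame on success, -1 otherwise. */
int locate_gop_start(const uint16_t *types, size_t n, uint16_t seq_record_nr,
                     size_t *gop, size_t *i_frame)
{
    size_t i;
    int have_gop = 0;

    assert(types != NULL && gop != NULL && i_frame != NULL);
    if (seq_record_nr == TY_NO_SEQ || seq_record_nr >= n)
        return -1;              /* no SEQ in this chunk */
    if (types[seq_record_nr] != TY_REC_SEQ)
        return -1;              /* header and records disagree */

    for (i = (size_t)seq_record_nr + 1; i < n; i++) {
        if (types[i] == TY_REC_GOP && !have_gop) {
            *gop = i;
            have_gop = 1;
        } else if (types[i] == TY_REC_I_FRAME) {
            if (!have_gop)
                return -1;      /* I-Frame before any GOP header */
            *i_frame = i;
            return 0;
        }
    }
    return -1;                  /* GOP or I-Frame is in a later chunk */
}
```

The last return is not necessarily an error in a real stream, since
the GOP header or I-Frame may simply live in the next chunk.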
This record layout is the key to fast and efficient indexing. Why use
separate indexing when the TyStream already provides us with a perfect
source to create an index from? Anyway, if we now have a struct like
this (there will be a few more entries in it, but it's a sketch)
typedef struct Gop_index_t gop_index_t;

struct Gop_index_t {
    /* Chunk number of seq */
    int chunk_number_seq;
    /* Record number for SEQ */
    int seq_rec_nr;
    /* Chunk number of I-Frame */
    int chunk_number_i_frame;
    /* Record number of I-Frame */
    int i_frame_rec_nr;
    /* Temporal reference */
    uint64_t time_of_iframe;
    /* Access pointers */
    gop_index_t * next;
    gop_index_t * previous;
};
and a function like this
gop_index_t * scan_tystream(tystream_holder_t * tystream);
What we get is a linked list holding
the index of a tystream which is what we want.
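How scan_tystream could grow that list is sketched below: one helper
appends an entry after the current tail, and one frees the whole list.
The struct is repeated (without comments) only so that the example
compiles on its own. gop_index_append and gop_index_free are names
invented here, and filling in only the two SEQ fields is a
simplification.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct Gop_index_t gop_index_t;

struct Gop_index_t {
    int chunk_number_seq;
    int seq_rec_nr;
    int chunk_number_i_frame;
    int i_frame_rec_nr;
    uint64_t time_of_iframe;
    gop_index_t *next;
    gop_index_t *previous;
};

/* Allocate a zeroed entry and link it after tail (tail may be NULL for
 * the first entry). Returns the new entry, or NULL if out of memory. */
gop_index_t *gop_index_append(gop_index_t *tail, int chunk_number_seq,
                              int seq_rec_nr)
{
    gop_index_t *entry = calloc(1, sizeof *entry);

    if (entry == NULL)
        return NULL;
    entry->chunk_number_seq = chunk_number_seq;
    entry->seq_rec_nr = seq_rec_nr;
    entry->previous = tail;
    if (tail != NULL) {
        assert(tail->next == NULL); /* only append at the real tail */
        tail->next = entry;
    }
    return entry;
}

/* Free the whole list, starting from its first entry. */
void gop_index_free(gop_index_t *head)
{
    while (head != NULL) {
        gop_index_t *next = head->next;

        free(head);
        head = next;
    }
}
```

With both next and previous set, the GUI can step forwards and
backwards through the SEQs without rescanning the stream.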
Now let us step through this:
The user opens the file in the editor. What we first do is probe the
stream and find where the audio starts. We also find out which
seq/gop we would start the video/audio stream with if we weren't about
to cut.
Given that information we run scan_tystream to create the index. This
is a very fast operation since we only need to read the two byte entry
in the header and optionally, say, around 200 bytes of record info. I
can't give you any real times but I did some tests and it's very fast,
i.e. more or less real time :). It should be fast since it's what Tivo
is using to scan the stream when doing ffw/fbw.
Now we read the first entry in the index list. From it we get the
chunk number. We read in three chunks, starting from the chunk of the
seq_header, with a revamped read_chunk func (one that seeks to a given
chunk number instead of, as today, needing to already be at the right
location).
Then we use the get_video func to fetch the I-Frame that is in rec_nr
Y and in chunk X. We feed the payload_t's payload_data to an MPEG-2
video decoder and display the result to the user in the GUI. This can
be done in real time so the user will not notice a big delay.
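The seeking variant of read_chunk mentioned in the steps above boils
down to multiplying the chunk number by the chunk size. A sketch in
plain ISO C follows. read_chunk_at is a hypothetical name and the
128 KiB chunk size is again my assumption. Note that fseek takes a
long, so where long is 32 bits this only reaches the first 2 GB of a
file. A real implementation would need fseeko or a platform specific
call for bigger tystreams.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

#define TY_CHUNK_SIZE (128L * 1024L) /* assumed tystream chunk size */

/* Seek to chunk number chunk_nr and read it into buf, which must hold
 * TY_CHUNK_SIZE bytes. Returns 0 on success, -1 on a seek error, a
 * short read, or a chunk number whose offset does not fit in a long. */
int read_chunk_at(FILE *fp, long chunk_nr, unsigned char *buf)
{
    assert(fp != NULL && buf != NULL);
    if (chunk_nr < 0 || chunk_nr > LONG_MAX / TY_CHUNK_SIZE)
        return -1;
    if (fseek(fp, chunk_nr * TY_CHUNK_SIZE, SEEK_SET) != 0)
        return -1;
    if (fread(buf, 1, (size_t)TY_CHUNK_SIZE, fp) != (size_t)TY_CHUNK_SIZE)
        return -1;
    return 0;
}
```

Reading the three chunks for an index entry is then three calls with
chunk_number_seq, chunk_number_seq + 1 and chunk_number_seq + 2.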
Anyway, the good thing is that we know the chunk numbers of the cut,
we know the times of the cut, and we know the exact seq number of the
cut. Feed this info to the cutpoint function and we will be home free
to make extremely precise cuts in the stream. The video cuts will use
the SEQ info while the audio stream will use the times. The problem
with using times for the video is that we need to take care of
dangling B-Frames, i.e. B-Frames that are displayed before the I-Frame
but decoded after it. Display order is like this: B B I B B P, while
decoding order is like this: I B B P B B.
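The relation between the two orders is carried by the temporal
reference of each picture: within a GOP it gives the display position,
counted from 0. The sketch below places the frames of one GOP, given in
decoding order, into display order. The frame_t type and the function
name are invented for the example, and it assumes a complete GOP in
which every temporal reference from 0 to n-1 occurs exactly once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frame descriptor: picture type plus temporal reference. */
typedef struct {
    char type;          /* 'I', 'P' or 'B' (never 0) */
    int temporal_ref;   /* display position within the GOP, from 0 */
} frame_t;

/* Place the n frames of one GOP, given in decoding order, into out[]
 * in display order. Returns 0 on success, or -1 if the temporal
 * references are not exactly 0..n-1 with each value used once. */
int gop_display_order(const frame_t *decode, size_t n, frame_t *out)
{
    size_t i;

    assert(decode != NULL && out != NULL);
    for (i = 0; i < n; i++)
        out[i].type = 0;                    /* mark every slot empty */
    for (i = 0; i < n; i++) {
        int ref = decode[i].temporal_ref;

        if (ref < 0 || (size_t)ref >= n || out[ref].type != 0)
            return -1;                      /* out of range or duplicate */
        out[ref] = decode[i];
    }
    return 0;
}
```

For the decoding order I B B P B B with temporal references 2 0 1 5 3
4 this gives B B I B B P, the display order quoted above. The two
leading B-Frames are the dangling ones that a time based video cut
would have to worry about.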
As for the GUI itself I suggest QT. It's cross platform and it's a
very solid toolkit (e.g. the whole of KDE, see www.kde.org, is built
on it). I have some example code of a QT tiff viewer. It uses the same
principle that we will use to display frames, although it uses the
tiff library and not the mpeg2 video library. We naturally have to
have code to scroll backwards and forwards, insert cut points and so
forth.
Anyhow what is needed to be done is (besides writing the GUI):
A scan function so we can create an index list that is numbered (seq numbers)
A get index XY function that returns the image (either in mpeg format or in raw image format) along with display time, chunk number etc., i.e. info that the user might benefit from
A revamped cutpoint function that can take an index and a list of cutpoints (XY to YX, i.e. seq numbers) and transform that into a cut list
A revamped cutpoint_is_in_remove_cut that handles times for audio and seq numbers for video
A revamped tystream_repair function that handles sync offsets in a better style and is adjusted for both cuts and repairs
A jdiner cut list to tydemux cut list function
A new reader/writer for our own cut list format
High level functions to handle reading, parsing, checking etc. in a good way so that the GUI can do the main demux/remux and cut of the tystream
Find a good (and especially fast) mpeg2 video decoder lib
Functions to set tystream parameters from the GUI
Stubs for the above functions as soon as we can (yesterday if possible :)). We need to start coding the GUI.
Support for the tytar format:
Initially I suggest that we support this format unpacked (i.e. you
have untarred the file). There are functions to read tar files
directly but I think we should limit ourselves a bit here. Basically
we have enough to do already. However, we will need to solve the
problem that the input file is in several parts. This is, however,
very similar to VOB files on a DVD.
The good thing is that libdvdread already has the functions that we
need to access the different “tar” parts as one big file. Hence
supporting the tar file format should be easy (unless we opt to read
the tar file directly).
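The core of treating several parts as one big file is a mapping from a
position in the virtual file to a part number and a position inside
that part. The sketch below shows only that mapping. It is not
libdvdread code and does not use its API, and locate_in_parts is a name
made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map a byte offset in the virtual "one big file" to the part holding
 * it and the offset inside that part. part_sizes[] lists the size of
 * each part in order. Returns 0 on success, -1 if offset lies past the
 * end of the last part. */
int locate_in_parts(const uint64_t *part_sizes, size_t n_parts,
                    uint64_t offset, size_t *part, uint64_t *local_offset)
{
    size_t i;

    assert(part_sizes != NULL && part != NULL && local_offset != NULL);
    for (i = 0; i < n_parts; i++) {
        if (offset < part_sizes[i]) {
            *part = i;
            *local_offset = offset;
            return 0;
        }
        offset -= part_sizes[i];    /* skip this part entirely */
    }
    return -1;
}
```

A read that crosses a part boundary would be split into two reads, one
up to the end of the current part and one from offset 0 of the next.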
Anyhow what is needed to be done is:
Fix of all outstanding bugs:
There are a lot of outstanding bugs and issues in the code base today.
Many of them will be taken care of by the revamping of the code needed
to provide functionality demanded by other parts of the project, hence
I have excluded those from the list below.
A reset of the PTS in the video stream makes us totally bail out (have example stream). The PTS is just an unsigned 33 bit integer, hence it will wrap around and we need to deal with it (this is another reason why times are a bad thing in a cut list).
If we have dual GOPs in a seq, tmp_ref checks will fail (we must make sure we have a seq with every gop)
Check seq not gop when we seek seqs. Currently we stop at both gops and seqs in a lot of our functions.
Check progressive frame when adding frames to correct sync
Make a separate gop check when seeking seqs, and fix gops without seq
Make a better solution for when the last P frame is missing from a gop
Check when getting the highest tmp ref in the next seq. This is a general check since I think I made some mistakes here.
A tmp_ref like this is valid: 2I 0B 1B 3P 6P 4B 5B, yet check_tmp_ref is failing, i.e. it says it's invalid and we rebuild the gop like this: 2I 3P 0B 1B 6P 4B 5B, which is also a valid tmp ref. However, it's not as effective as 2I 0B 1B 3P 6P 4B 5B when it comes to decoder buffers, hence we need to fix it.
The tmp_ref check when we have missing P frames at the end is very experimental and we need to check it
Change all labs to llabs so they can handle 64 bit ints
Change all uint64_t to int64_t (PTS time stamps are 33 bit unsigned ints, so an int64_t should be able to handle them). This will make a lot of code more effective and we don't need to worry that much. This clean up demands a lot of work but it's definitely worth it.
Fix repair tystream when handling SA/UK repairs. We cut way too much today, which will not work when we start dealing with cuts.
Fix repair tystream when handling S2 SA video
Enable support for S2 Dtivo (needs fix in make_gop_closed)
Fix start of S2 SA if we have jumped into the stream. We will fail as it is now!
Drop 1.3/1.5 support?? Yes, I would say. It will clean up the code and I will for sure not support fancy things like e.g. cutting in a 1.3/1.5 stream.
We leak memory in the parse video functions if we have gaps in the stream
We will need to have a much better func to deal with times in the tmp_ref function
We also need our own private ftp/web site where we can store tystreams with different kinds of errors as well as perfect streams of all the combinations out there.
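Three of the items in the list above (the PTS reset, labs versus llabs,
and uint64_t versus int64_t) meet in one small calculation: the signed
distance between two 33 bit time stamps. The sketch below is one way to
do it, assuming that two stamps being compared are never more than half
the PTS range apart. pts_diff and pts_distance are hypothetical names,
not existing tydemux functions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PTS_MODULUS (INT64_C(1) << 33)  /* PTS values run 0 .. 2^33 - 1 */

/* Signed distance from PTS a to PTS b, taking the shorter way around
 * the 33 bit wrap. The result lies in [-2^32, 2^32). */
int64_t pts_diff(int64_t a, int64_t b)
{
    int64_t d;

    assert(a >= 0 && a < PTS_MODULUS && b >= 0 && b < PTS_MODULUS);
    d = b - a;
    if (d >= PTS_MODULUS / 2)
        d -= PTS_MODULUS;       /* b wrapped behind a */
    else if (d < -(PTS_MODULUS / 2))
        d += PTS_MODULUS;       /* b wrapped ahead of a */
    return d;
}

/* Absolute distance. llabs() is used because labs() takes a long,
 * which may be only 32 bits wide. */
int64_t pts_distance(int64_t a, int64_t b)
{
    return (int64_t)llabs((long long)pts_diff(a, b));
}
```

With signed int64_t values the subtraction can go negative without any
surprises, which is the practical argument for the uint64_t to int64_t
change.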
As always the document is up for discussion. Ahh, one last thing: we
need to have a common code format, otherwise 1: it will not look good,
and 2: it will be trouble to read each other's code. Take a look at
the Hacking_Roules_Tips file present in the CVS top level tydemux dir.
As said, it's up for discussion but we will need to think something
out.
Links, books, documents etc:
MPEG:
AC3/a52:
DVD:
GUI Toolkit (QT):
Trolltech, a given, perfect site with lots of info. Well, they are the
company behind it. Note that the free X11 version is 3.1.x and the
free Win version is 2.x, so some small steps need to be taken care of
when writing the code. There is no native free version for Mac (it
costs) but since Apple has its own X server we can just use the free
X11 version. There might also be someone out there with a commercial
version of QT for Mac who can compile a native version for us. Anyhow,
the link to Trolltech is http://www.trolltech.com
QT Tiff viewer example code: http://qfaxreader.sourceforge.net/
Socket handling of tystreams:
Development tools:
Valgrind advertisement: this is the best thing that has happened to
Linux programming since sliced bread. It has similar functionality to
Purify but with the extra advantage that you don't need to compile the
sources with it. Just run “valgrind your_prog” and it will report
every error your program is making along with all the memory leaks
that you have. Way cool I (Olaf) must say.
Even better is that you can use a program called kcachegrind. It is an
extension to valgrind and it lets you visualize the profiling data
when you run a program with valgrind, hence you don't need to compile
in profiling support. Way cool I (Olaf) must say :).
The link to valgrind is http://developer.kde.org/~sewardj/ and the
link to kcachegrind is http://kcachegrind.sourceforge.net/. Both sites
hold extensive info about how to use the programs.
Both programs are a must in any Linux programmer's tool chest. They
are even useful for Windows programmers if the code is cross platform,
as in tydemux's case.