Developers: | Spirit DSP |
System launch date: | 2016/07/12 |
Industries: | Telecommunications and communication |
Technology: | Speech technologies |
Spirit Voice Storage Codec (VoSToC) is a speech codec intended for compressing and storing large volumes of voice data.
2016
Patent application
On September 29, 2016, SPIRIT announced that it had filed a patent application for a technology that compresses large volumes of multimedia information (video and voice) for storage in data storage systems (DWH).
The G.729 and G.723.1 voice codecs compress speech roughly 8 times (with losses), after which SPIRIT can compress the speech a further 3 times. The H.263/H.264 video codecs compress video by about 50 times on average (with losses), after which SPIRIT can compress video by 100 times through intelligent analysis of the information based on its importance, the developer company stated.
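The speech figures above can be checked with simple arithmetic. The sketch below assumes a 64 kbit/s reference stream (8 kHz sampling, 8 bits per sample); the article does not name the baseline, so `REFERENCE_BPS` and the helper `compressed_bitrate` are illustrative assumptions, not part of SPIRIT's technology.

```python
# Assumption: the uncompressed reference is 64 kbit/s telephone speech
# (8 kHz sampling, 8 bits per sample). The article does not state a baseline.
REFERENCE_BPS = 64_000


def compressed_bitrate(reference_bps: float, *ratios: float) -> float:
    """Apply successive compression ratios to a reference bitrate."""
    rate = reference_bps
    for ratio in ratios:
        rate /= ratio
    return rate


# Roughly 8x compression by a standard codec such as G.729.
standard_codec_bps = compressed_bitrate(REFERENCE_BPS, 8)
# A further 3x on top of that, as claimed by the developer.
after_further_3x_bps = compressed_bitrate(REFERENCE_BPS, 8, 3)
overall_ratio = REFERENCE_BPS / after_further_3x_bps

print(f"after 8x codec:   {standard_codec_bps:.0f} bit/s")
print(f"after further 3x: {after_further_3x_bps:.0f} bit/s")
print(f"overall ratio:    {overall_ratio:.0f}x")
```

Under this assumption the 8x stage corresponds to 8 kbit/s, which is the bitrate of G.729, and the further 3x stage lands in the range of low-bitrate vocoders.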
When looking for the best solution for storing large volumes of multimedia data, two key parameters deserve attention: the compression ratio and the quality of the recorded speech. We have developed voice compression technologies, already in use on the world market, that can record nearly 2 hours of vocoded speech in 1 MB of memory. The law requires all multimedia data to be recorded and stored, and while text takes up little space in a storage system, speech, audio and video require orders of magnitude more resources. There is no need to store all the "information garbage": the less informative part of the data can be recorded more compactly, while the most important fragments of a recording should be preserved at high quality.
The concept proposed by SPIRIT combines intelligent classification of the different fragments of multimedia information with known lossy source-coding techniques. It helps to increase the efficiency of compact representation of multimedia information and makes it possible to answer meta-level questions:
- what to store,
- how to store,
- in what format to store,
- how to determine the value of the stored information for the systems of decision-making,
- how to encode it and write it to digital memory.
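The idea of storing less informative fragments more compactly can be illustrated with a small storage estimate. This is only a sketch: the `Segment` type, the importance scores, the thresholds in `POLICY` and the choice of bitrates are all hypothetical, and the article does not describe how SPIRIT actually classifies fragments or assigns bitrates.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    duration_s: float  # length of the fragment in seconds
    importance: float  # hypothetical score in [0, 1] from some classifier


# Hypothetical policy: (minimum importance, bitrate in bit/s), highest first.
POLICY = [(0.7, 8000), (0.3, 2400), (0.0, 1200)]


def pick_bitrate(importance: float) -> int:
    """Return the bitrate of the first policy tier the score qualifies for."""
    for threshold, bitrate in POLICY:
        if importance >= threshold:
            return bitrate
    return POLICY[-1][1]


def storage_bytes(segments, bitrate_for=pick_bitrate) -> float:
    """Total storage needed when each segment is coded at its own bitrate."""
    return sum(s.duration_s * bitrate_for(s.importance) / 8 for s in segments)


# A made-up 7-minute recording: one important minute, the rest less so.
recording = [Segment(60, 0.9), Segment(240, 0.1), Segment(120, 0.5)]

adaptive = storage_bytes(recording)
uniform = storage_bytes(recording, bitrate_for=lambda _score: 8000)

print(f"uniform 8000 bit/s: {uniform:.0f} bytes")
print(f"importance-based:   {adaptive:.0f} bytes")
```

The saving depends entirely on how much of the recording is judged unimportant, which is why the classification step carries most of the weight in such a scheme.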
Technical details
Video is encoded with ITU-T H.26x codecs and proprietary VPh codecs. The compression ratio varies from 10 to 500 times depending on the permissible bitrate of the video stream or the amount of memory allocated for storing images/video, the target quality level, the type of codec and the nature of the images/video.
Lossy codecs of this kind compress images and video on two principles:
- elimination of spatial redundancy: the image matrix, as in JPEG (or a reference frame, as in MPEG-2 and MPEG-4), is transformed from the spatial domain to the frequency domain using a system of orthogonal functions (Fourier transform, Walsh transform, discrete cosine transform (DCT), wavelets, etc.); the components are then quantized finely or coarsely, which introduces an error, and the quantized components are coded with lossless entropy coding (in particular, arithmetic coding);
- elimination of temporal redundancy between adjacent frames of a video stream, which as a rule differ little: changes caused by movement of objects in the frame or of the camera are detected by a motion estimator, and motion vectors are determined so that only the changed fragments of a new frame are coded relative to the reference frame (for a detailed description see the MPEG-2, MPEG-4 and H.26x standards).
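The spatial-redundancy step can be sketched in a few lines. The example below applies an orthonormal 8x8 DCT-II to one block and quantizes the coefficients with a single uniform step; real codecs use per-frequency quantization tables followed by entropy coding, both omitted here. The test block and the step sizes are arbitrary illustrative choices.

```python
import math

N = 8  # block size used by JPEG-style codecs


def dct_matrix(n: int = N):
    """Orthonormal DCT-II basis: row k is the k-th cosine basis function."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


C = dct_matrix()
CT = [list(col) for col in zip(*C)]  # transpose of C


def dct2(block):
    """Forward 2-D transform: spatial domain -> frequency domain."""
    return matmul(matmul(C, block), CT)


def idct2(coeffs):
    """Inverse 2-D transform: frequency domain -> spatial domain."""
    return matmul(matmul(CT, coeffs), C)


def analyse(block, step):
    """Quantize with a uniform step; return (non-zero levels, max pixel error)."""
    levels = [[round(c / step) for c in row] for row in dct2(block)]
    restored = idct2([[q * step for q in row] for row in levels])
    nonzero = sum(1 for row in levels for q in row if q != 0)
    max_err = max(abs(a - b)
                  for ra, rb in zip(block, restored) for a, b in zip(ra, rb))
    return nonzero, max_err


# A smooth gradient block: most of its energy sits in a few low frequencies.
block = [[float(8 * r + 4 * c) for c in range(N)] for r in range(N)]

for step in (1, 16, 64):
    nonzero, max_err = analyse(block, step)
    print(f"step={step:3d}: {nonzero:2d} non-zero coefficients, "
          f"max error {max_err:.2f}")
```

A coarser step leaves fewer non-zero coefficients for the entropy coder to store, at the cost of a larger reconstruction error, which is the trade-off between compression ratio and quality described above.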
Speech is encoded with codecs that follow the ITU-T G.7xx series of standards (G.711, G.718, G.719, G.722.2 (AMR-WB), G.723.1, G.726, G.729, G.729.1, etc.), as well as GSM, SILK, iLBC and other proprietary codecs. The compression ratio varies from 5 to 50 times depending on the required bitrate at the encoder output, the target quality level, the permissible delay and the nature of the voice signal (taking pauses in speech into account). If the shape of the original signal is preserved at the codec output with a controlled error, such codecs are called waveform codecs.
Audio signals are encoded with MP3, AAC, AAC+, WMA and other audio codecs. Practically all audio codecs are built on the waveform coding method, but the signal is, as a rule, processed in the frequency domain. The compression ratio of an audio stream varies from 5 to 30 times and depends on the audio bandwidth and the required playback quality after decoding.
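A waveform codec can be illustrated with logarithmic companding of the kind standardized in G.711. The sketch below uses the continuous mu-law formula with 8-bit codes; real G.711 uses a piecewise-linear approximation of this curve, so this is not a bit-exact implementation. The test tone and the function names are illustrative assumptions.

```python
import math

MU = 255  # mu-law parameter


def mu_compress(x: float) -> float:
    """Map x in [-1, 1] to [-1, 1] with logarithmic companding."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_expand(y: float) -> float:
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def encode(x: float, bits: int = 8) -> int:
    """Quantize the companded sample to a signed integer code."""
    levels = 2 ** (bits - 1) - 1  # 127 for 8 bits
    return round(mu_compress(x) * levels)


def decode(code: int, bits: int = 8) -> float:
    """Reconstruct an approximate sample from its integer code."""
    levels = 2 ** (bits - 1) - 1
    return mu_expand(code / levels)


# 10 ms of a 440 Hz tone sampled at 8 kHz (arbitrary test signal).
samples = [0.5 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(80)]
decoded = [decode(encode(s)) for s in samples]
max_error = max(abs(a - b) for a, b in zip(samples, decoded))

print(f"max sample error after 8-bit companding: {max_error:.4f}")
```

The decoded samples follow the shape of the input sample by sample, with an error bounded by the quantizer. Vocoders, by contrast, transmit model parameters and do not aim to preserve the waveform itself.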
VoSToC: a codec with a high compression ratio for voice data
On July 12, 2016, SPIRIT announced the market launch of VoSToC (Voice Storage Codec), a speech codec for applications that require compression and storage of large volumes of voice data.
As the developer company stated, SPIRIT VoSToC is a special-purpose vocoder that operates at 2400 bps and is designed for speech storage. According to the developer, VoSToC surpasses comparable codecs worldwide in its class in speech reproduction quality. Codecs intended for storing multimedia data (in particular, voice codecs) do not need to keep the short algorithmic delay that is important for two-way real-time communication, so the signal can be processed more effectively and the quality of the decoded speech increased. SPIRIT VoSToC uses this approach: at a low vocoder bitrate (2400 bps) it delivers a decoded signal quality typical of codecs with higher bitrates.
When looking for the best solution for storing large volumes of multimedia data, as is now required under the "Yarovaya package" of laws, two key parameters deserve attention: the compression ratio and the quality of the recorded speech. SPIRIT has developed speech compression technologies, widely licensed on the world market, that can record nearly 2 hours of vocoded speech in 1 MB of memory. At the same time, SPIRIT uses dedicated voice signal processing methods for coding, which ensures high-quality reproduction of the recorded speech. This is exactly what Russian telecom operators and Russian vendors of multimedia data storage equipment need today in order to minimize the cost of complying with the new statutory requirements for storing call recordings.
According to the developers, VoSToC belongs to SPIRIT's family of low-bitrate speech codecs (operating at 1200, 2400, 3600, 4800, 6000 and 8000 bps), whose voice coding quality is not inferior to that of world-standard vocoders in their class. SPIRIT has also developed IP-MR, a scalable codec that operates at different bitrates and can provide high-quality speech coding when the channel capacity changes, and uses it in its engine for voice and video communication.
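The relation between bitrate and recording time per megabyte can be worked out directly. The sketch below assumes a binary megabyte (1,048,576 bytes) and ignores container overhead. The `active_fraction` parameter is a hypothetical way to model pauses that are not stored; the article does not say whether the "nearly 2 hours in 1 MB" figure relies on pause removal or on a particular bitrate.

```python
BITRATES_BPS = [1200, 2400, 3600, 4800, 6000, 8000]
MEGABYTE = 1024 * 1024  # assumption: binary megabyte, no container overhead


def minutes_per_megabyte(bitrate_bps: float, active_fraction: float = 1.0) -> float:
    """Minutes of recording that fit in 1 MB.

    active_fraction is the (hypothetical) share of the recording that is
    actually coded; 1.0 means every second is stored at the full bitrate.
    """
    bytes_per_second = bitrate_bps * active_fraction / 8
    return MEGABYTE / bytes_per_second / 60


for rate in BITRATES_BPS:
    print(f"{rate:5d} bit/s: {minutes_per_megabyte(rate):6.1f} min per MB")

# Same 2400 bit/s codec if only half of the recording is active speech.
print(f"2400 bit/s, 50% active: {minutes_per_megabyte(2400, 0.5):.1f} min per MB")
```

Under these assumptions, continuous coding at 2400 bps gives a little under one hour per megabyte, while 1200 bps, or 2400 bps with about half of the recording treated as unstored pauses, gives close to two hours.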