Patents

 
Methods and systems for virtualizing audio hardware for one or more virtual machines
Tue, 03 Mar 2015 08:00:00 EST
The present disclosure is directed towards methods and systems for virtualizing audio hardware for one or more virtual machines. A control virtual machine (VM) may translate a first stream of audio function calls from a first VM hosted by a hypervisor. The translated first stream of audio function calls may be destined for a sound card of the computing device executing the hypervisor. The control VM may detect a second stream of audio function calls from a second VM hosted by the hypervisor and translate that second stream. In response to detecting the second stream, the control VM may merge the translated first and second streams of audio function calls. The control VM may transmit the merged stream of audio function calls to the sound card.
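As an illustration of the merge step, here is a minimal Python sketch assuming a hypothetical AudioCall record per translated call and a simple timestamp-ordered merge; it is an illustration, not the patented implementation.

```python
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class AudioCall:
    """One translated audio function call, tagged with the issuing VM (hypothetical model)."""
    timestamp: float
    vm_id: str = field(compare=False)
    function: str = field(compare=False)

def merge_call_streams(*streams):
    """Merge per-VM streams of translated calls into one stream, ordered by timestamp."""
    return list(heapq.merge(*streams))

# Example: two VMs each issuing translated calls destined for the host sound card.
vm1 = [AudioCall(0.00, "vm1", "open"), AudioCall(0.02, "vm1", "write")]
vm2 = [AudioCall(0.01, "vm2", "open"), AudioCall(0.03, "vm2", "write")]
for call in merge_call_streams(vm1, vm2):
    print(call.timestamp, call.vm_id, call.function)
```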
Method and apparatus for providing case restoration
Tue, 03 Mar 2015 08:00:00 EST
A method and apparatus for providing case restoration in a communication network are disclosed. For example, the method obtains one or more content sources from one or more information feeds, and extracts textual information from the one or more content sources obtained from the one or more information feeds. The method then creates or updates a capitalization model based on the textual information.
Time ordered indexing of an information stream
Tue, 03 Mar 2015 08:00:00 EST
Methods and apparatuses in which two or more types of attributes from an information stream are identified. Each of the identified attributes from the information stream is encoded. A time ordered indication is assigned with each of the identified attributes. Each of the identified attributes shares a common time reference measurement. A time ordered index of the identified attributes is generated.
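A minimal Python sketch of building a time-ordered index over two attribute types identified in one stream, all sharing a common time reference; the attribute extraction itself is faked with made-up events.

```python
def build_time_ordered_index(events):
    """Index identified attributes by their time offset from a common reference.

    events: iterable of (time_offset_seconds, attribute_type, value) tuples,
    e.g. recognized words and detected speakers from the same stream (made up).
    """
    index = sorted(events, key=lambda e: e[0])        # time-ordered indication
    by_attribute = {}
    for t, kind, value in index:
        by_attribute.setdefault(kind, []).append((t, value))
    return index, by_attribute

events = [(1.2, "word", "hello"), (0.0, "speaker", "A"), (1.9, "word", "world")]
timeline, per_type = build_time_ordered_index(events)
print(timeline[0], per_type["word"])
```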
Machine translation using information retrieval
Tue, 03 Mar 2015 08:00:00 EST
Systems, methods, and apparatuses, including computer program products, are provided for machine translation using information retrieval techniques. In general, in one implementation, a method is provided. The method includes providing a received input segment as a query to a search engine, the search engine searching an index of one or more collections of documents, receiving one or more candidate segments in response to the query, determining a similarity of each candidate segment to the received input segment, and for one or more candidate segments having a determined similarity that exceeds a threshold similarity, providing a translated target segment corresponding to the respective candidate segment.
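A minimal Python sketch of the retrieval-based flow described above, assuming a pre-built translation memory (a dict standing in for the search engine's index of document collections) and a token-overlap similarity; the threshold and similarity measure are illustrative assumptions.

```python
def jaccard(a, b):
    """Token-overlap similarity between two segments."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def translate_by_retrieval(input_segment, translation_memory, threshold=0.6):
    """Return translations of stored segments sufficiently similar to the input segment.

    translation_memory: dict mapping stored source segments to target segments.
    """
    results = []
    for candidate, target in translation_memory.items():
        if jaccard(input_segment, candidate) > threshold:
            results.append(target)
    return results

memory = {"the cat sat on the mat": "le chat s'est assis sur le tapis"}
print(translate_by_retrieval("the cat sat on a mat", memory))
```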
Subjective linguistic analysis
Tue, 03 Mar 2015 08:00:00 EST
A system and related method for the electronic processing of text onto a two-dimensional coordinate system to analyze the attitudinal mindset associated with the text. The system and related method may also be employed to generate text that imparts a desired attitudinal mindset. The system includes a computer system embodying functions that enable a user to analyze the text. The system includes one or more functions to parse attitudinal words from objective words and associate two-dimensional coordinates with the attitudinal words. The system further includes one or more functions for mapping the associated two-dimensional coordinates to show the position of each attitudinal word of the text in relation to each other attitudinal word of the text. The system decomposes attitudinal words into an attitudinal equivalence and a reference category and enables the generation of a report on the mindset associated with the analyzed text.
Management of content items
Tue, 03 Mar 2015 08:00:00 EST
Disclosed are various embodiments of a content management application that facilitates a content management system. Content items that can include audio and/or video can be stored in the content management system. A transcript is generated that corresponds to spoken words within the content. Content can be tagged based upon the transcript. Content anomalies can also be detected, and editing functionality is provided.
Position invariant compression of files within a multi-level compression scheme
Tue, 03 Mar 2015 08:00:00 EST
An aggregated file is generated, by storing a plurality of initially provided files in a sequence. A computational device executes a first set of compression operations on each of the plurality of initially provided files to generate a plurality of compressed files that replace the plurality of initially provided files, wherein starting locations of the plurality of compressed files and the plurality of initially provided files are identical, and wherein predetermined bit patterns are stored in empty spaces that follow each of the plurality of compressed files. The computational device sends the aggregated file to a linear storage device configured to perform a second set of compression operations on the aggregated file.
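A rough Python sketch of the in-place, position-preserving compression described above, assuming zlib for the first-stage compression and a 0x00 byte as the predetermined fill pattern; the second set of compression operations on the linear storage device is out of scope.

```python
import zlib

def build_aggregated_file(files, fill=b"\x00"):
    """Compress each file in place within its original footprint.

    Each compressed file starts exactly where the original file started; the
    space left over up to the original length is filled with a predetermined
    pattern (0x00 here, as an assumption).
    """
    aggregated = bytearray()
    for data in files:
        compressed = zlib.compress(data)
        if len(compressed) > len(data):
            compressed = data              # incompressible: keep the original bytes
        aggregated += compressed + fill * (len(data) - len(compressed))
    return bytes(aggregated)

files = [b"A" * 1000, b"B" * 1000]
blob = build_aggregated_file(files)
print(len(blob))          # 2000: starting offsets 0 and 1000 are unchanged by compression
```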
Method and an apparatus for processing an audio signal
Tue, 03 Mar 2015 08:00:00 EST
A method for processing an audio signal is disclosed. The method includes frequency-transforming an audio signal to generate a frequency spectrum, determining a weighting per band corresponding to the energy per band using the frequency spectrum, receiving a masking threshold based on a psychoacoustic model, applying the weighting to the masking threshold to generate a modified masking threshold, and quantizing the audio signal using the modified masking threshold.
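A hedged Python sketch of the described flow, with a constant standing in for the psychoacoustic masking threshold and uniform bands standing in for a real band layout; it only illustrates the weighting and quantization steps.

```python
import numpy as np

def quantize_with_weighted_masking(signal, n_bands=8, base_threshold=1e-3):
    """Transform, weight per band by energy, modify the masking threshold, quantize.

    base_threshold stands in for a psychoacoustic-model masking threshold (assumption).
    """
    spectrum = np.fft.rfft(signal)
    bands = np.array_split(np.abs(spectrum) ** 2, n_bands)
    energy = np.array([b.sum() for b in bands])            # energy per band
    weight = energy / (energy.sum() + 1e-12)               # weighting derived from band energy
    modified_threshold = base_threshold * (1.0 + weight)   # apply weighting to masking threshold
    quantized = []
    for band, step in zip(np.array_split(spectrum, n_bands), modified_threshold):
        quantized.append(np.round(band / step) * step)     # coarser steps where masking allows
    return np.fft.irfft(np.concatenate(quantized), n=len(signal))

t = np.linspace(0, 1, 8000, endpoint=False)
audio = np.sin(2 * np.pi * 440 * t)
print(quantize_with_weighted_masking(audio).shape)
```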
Methods and systems for interfaces allowing limited edits to transcripts
Tue, 03 Mar 2015 08:00:00 EST
A transcript interface for displaying a plurality of words of a transcript in a text editor can be provided and configured to receive a command to edit the transcript. Limited edits to data corresponding to the transcript can be made in response to commands received via the user interface module. For example, edits may be limited to selection of a single word in the text editor for editing via a given command. The edit may affect an adjacent word in some instances, such as when two adjacent words are merged. In some embodiments, data corresponding to the selected word of the transcript is changed to reflect the edit without changing data defining the relative timing of those words of the transcript that are not adjacent to the selected word.
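A minimal Python sketch of the merge edit described above, assuming hypothetical word records with text and start/end times; only the two adjacent entries are rewritten, so timing data for all other words is untouched.

```python
def merge_adjacent_words(transcript, index):
    """Merge the word at `index` with the following word.

    Word records are hypothetical dicts with 'text', 'start', and 'end' keys.
    """
    first, second = transcript[index], transcript[index + 1]
    merged = {
        "text": first["text"] + second["text"],
        "start": first["start"],      # keep the earlier start time
        "end": second["end"],         # and the later end time
    }
    return transcript[:index] + [merged] + transcript[index + 2:]

words = [
    {"text": "hello", "start": 0.0, "end": 0.4},
    {"text": "web",   "start": 0.5, "end": 0.7},
    {"text": "cast",  "start": 0.7, "end": 1.0},
]
print(merge_adjacent_words(words, 1))
```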
Enhanced speech-to-speech translation system and methods for adding a new word
Tue, 03 Mar 2015 08:00:00 EST
A speech translation system and methods for cross-lingual communication that enable users to improve and modify the content and usage of the system and to easily abort or reset translation. The system includes a speech recognition module configured for accepting an utterance and adding a new word, a machine translation module, an interface configured to communicate the utterance and the proposed translation, a correction module, and an abort action unit that removes any hypotheses or partial hypotheses and terminates translation. The system also includes modules for storing favorites, changing the language mode, automatically identifying the language, providing language drills, and viewing third-party information relevant to the conversation, among other things.
Controlling audio video display device (AVDD) tuning using channel name
Tue, 03 Mar 2015 08:00:00 EST
A television, or other device with a television tuner, can be controlled to tune directly to a specific channel name, such as a broadcaster's station name, by using EPG metadata to provide a correlation between channel number and channel name.
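A minimal Python sketch of the lookup implied above, assuming the EPG metadata has already been parsed into (channel number, channel name) pairs; the station names are made up.

```python
def build_name_to_number(epg_entries):
    """Build a lookup from channel name to channel number out of EPG metadata."""
    return {name.lower(): number for number, name in epg_entries}

def tune_by_name(name, name_to_number):
    """Resolve a station name to a tunable channel number."""
    number = name_to_number.get(name.lower())
    if number is None:
        raise KeyError(f"unknown channel name: {name}")
    return number

epg = [(4, "KOMO"), (5, "KING"), (7, "KIRO")]   # hypothetical broadcaster station names
table = build_name_to_number(epg)
print(tune_by_name("king", table))               # -> 5
```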
User intent analysis extent of speaker intent analysis system
Tue, 03 Mar 2015 08:00:00 EST
A speaker intent analysis system and method for validating the truthfulness and intent of a plurality of participants' responses to questions. A computer stores, retrieves, and transmits a series of questions to be answered audibly by the participants. The participants' answers are received by a data processor, which analyzes and records the participants' speech parameters to determine the likelihood of dishonesty. In addition to analyzing participants' speech parameters to detect stress or other abnormality, the processor may be equipped with voice recognition software to screen responses that, while not dishonest, are indicative of possible malfeasance on the part of the participants. Once the responses are analyzed, the processor produces an output that is indicative of the participant's credibility. The output may be sent to the appropriate parties and/or devices, such as a web page, computer, e-mail, PDA, pager, database, or report, for appropriate action.
Multiple voices in audio content
Tue, 03 Mar 2015 08:00:00 EST
A content customization service is disclosed. The content customization service may identify one or more speakers in an item of content, and map one or more portions of the item of content to a speaker. A speaker may also be mapped to a voice. In one embodiment, the content customization service obtains portions of audio content synchronized to the mapped portions of the item of content. Each portion of audio content may be associated with a voice to which the speaker of the portion of the item of content is mapped. These portions of audio content may be combined to produce a combined item of audio content with multiple voices.
Method and apparatus for utterance verification
Tue, 03 Mar 2015 08:00:00 EST
A method and apparatus for utterance verification are provided for verifying a recognized vocabulary item output from speech recognition. The apparatus for utterance verification includes a reference score accumulator, a verification score generator, and a decision device. A log-likelihood score obtained from speech recognition is processed by taking the logarithm of the probability of one of the feature vectors of the input speech conditioned on one of the states of each model vocabulary. A verification score is generated based on the processed result and compared with a predetermined threshold value so as to reject or accept the recognized vocabulary item.
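A minimal Python sketch of the accept/reject decision described above; the per-frame probabilities and the threshold value are made-up illustrations, not the patent's scoring.

```python
import math

def verification_score(frame_probs):
    """Average log-likelihood over frames: P(feature vector | model state) per frame."""
    return sum(math.log(p) for p in frame_probs) / len(frame_probs)

def verify_utterance(frame_probs, threshold=-2.0):
    """Accept the recognized vocabulary item if the verification score clears the threshold.

    The threshold value is an assumption for illustration.
    """
    return verification_score(frame_probs) >= threshold

probs = [0.6, 0.4, 0.7, 0.5]        # per-frame probabilities from a recognizer (made up)
print(verification_score(probs), verify_utterance(probs))
```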
System and method for performing dual mode speech recognition
Tue, 03 Mar 2015 08:00:00 EST
A system and method for performing dual mode speech recognition, employing a local recognition module on a mobile device and a remote recognition engine on a server device. The system accepts a spoken query from a user, and both the local recognition module and the remote recognition engine perform speech recognition operations on the query, returning a transcription and confidence score, subject to a latency cutoff time. If both sources successfully transcribe the query, then the system accepts the result having the higher confidence score. If only one source succeeds, then that result is accepted. In either case, if the remote recognition engine does succeed in transcribing the query, then a client vocabulary is updated if the remote system result includes information not present in the client vocabulary.
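A hedged Python sketch of the dual-mode selection logic, with hypothetical local and remote recognizer callables and a latency cutoff; the client-vocabulary update step is omitted.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError

def dual_mode_recognize(audio, local_recognize, remote_recognize, cutoff_s=1.5):
    """Run both recognizers, honor a latency cutoff, keep the higher-confidence result.

    local_recognize / remote_recognize are hypothetical callables returning
    (transcription, confidence) or None on failure.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = {name: pool.submit(fn, audio)
                   for name, fn in (("local", local_recognize), ("remote", remote_recognize))}
        results = {}
        for name, fut in futures.items():
            try:
                results[name] = fut.result(timeout=cutoff_s)
            except TimeoutError:
                results[name] = None                      # source missed the latency cutoff
    candidates = [r for r in results.values() if r is not None]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r[1])            # higher confidence score wins

# Toy recognizers standing in for the on-device module and the server engine.
print(dual_mode_recognize(b"...",
                          lambda a: ("play some jazz", 0.72),
                          lambda a: ("play some jazz music", 0.91)))
```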
Indexing and search of content in recorded group communications
Tue, 03 Mar 2015 08:00:00 EST
In one embodiment, indexing content in streamed data includes receiving streams of audio data encoding a recording of a live ongoing group communication, where each stream of audio data encodes a different one of multiple voices. Each of the streams of audio data is provided to a recognizer to cause separate recognition of words in each of the streams. The recognized words are indexed to corresponding locations in each of the streams, and the streams are combined into a combined stream of audio data by synchronizing at least one common location in the streams. Embodiments allow accurate recognition of speech in group communications in which multiple speakers have simultaneously spoken, and accurate search of content encoded and processed from such speech.
Computer-implemented system and method for voice transcription error reduction
Tue, 03 Mar 2015 08:00:00 EST
A computer-implemented system and method for voice transcription error reduction is provided. Speech utterances are obtained from a voice stream and each speech utterance is associated with a transcribed value and a confidence score. Those utterances with transcription values associated with lower confidence scores are identified as questionable utterances. One of the questionable utterances is selected from the voice stream. A predetermined number of questionable utterances from other voice streams and having transcribed values similar to the transcribed value of the selected questionable utterance are identified as a pool of related utterances. A further transcribed value is received for each of a plurality of the questionable utterances in the pool of related utterances. A transcribed message is generated for the voice stream using those transcribed values with higher confidence scores and the further transcribed value for the selected questionable utterance.
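A minimal Python sketch of selecting questionable utterances and pooling related ones across voice streams; the utterance records, the confidence floor, and the exact-match similarity test are illustrative assumptions.

```python
def questionable_utterances(stream, confidence_floor=0.5):
    """Pick the utterances whose transcribed values carry low confidence scores.

    Each utterance is a hypothetical dict with 'value' and 'confidence' keys.
    """
    return [u for u in stream if u["confidence"] < confidence_floor]

def related_pool(selected, other_streams, pool_size=3):
    """Collect up to pool_size low-confidence utterances from other streams whose
    transcribed values match the selected utterance (exact match here, a stand-in
    for a similarity test)."""
    pool = []
    for stream in other_streams:
        for u in questionable_utterances(stream):
            if u["value"] == selected["value"]:
                pool.append(u)
                if len(pool) == pool_size:
                    return pool
    return pool

stream = [{"value": "acct 42", "confidence": 0.35}, {"value": "yes", "confidence": 0.95}]
others = [[{"value": "acct 42", "confidence": 0.40}], [{"value": "acct 42", "confidence": 0.20}]]
selected = questionable_utterances(stream)[0]
print(len(related_pool(selected, others)))      # -> 2
```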
Speech recognition using multiple language models
Tue, 03 Mar 2015 08:00:00 EST
In accordance with one embodiment, a method of generating language models for speech recognition includes identifying a plurality of utterances in training data corresponding to speech, generating a frequency count of each utterance in the plurality of utterances, generating a high-frequency plurality of utterances from the plurality of utterances having a frequency that exceeds a predetermined frequency threshold, generating a low-frequency plurality of utterances from the plurality of utterances having a frequency that is below the predetermined frequency threshold, generating a grammar-based language model using the high-frequency plurality of utterances as training data, and generating a statistical language model using the low-frequency plurality of utterances as training data.
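A minimal Python sketch of the frequency split described above; the "models" returned are plain data stand-ins rather than real grammar-based or statistical language model training.

```python
from collections import Counter

def split_by_frequency(utterances, threshold=2):
    """Count utterances and split them into high- and low-frequency sets."""
    counts = Counter(utterances)
    high = [u for u, c in counts.items() if c > threshold]
    low = [u for u, c in counts.items() if c <= threshold]
    return high, low

def train_language_models(utterances, threshold=2):
    """Route frequent utterances to a grammar-style model and the rest to a statistical one.

    The returned 'models' are illustrative placeholders, not trained language models.
    """
    high, low = split_by_frequency(utterances, threshold)
    grammar_model = {"type": "grammar", "rules": sorted(high)}
    statistical_model = {"type": "statistical",
                         "counts": Counter(w for u in low for w in u.split())}
    return grammar_model, statistical_model

data = ["call home", "call home", "call home", "what is the weather in paris"]
print(train_language_models(data))
```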
System and method for teaching non-lexical speech effects
Tue, 03 Mar 2015 08:00:00 EST
A method and system for teaching non-lexical speech effects includes delexicalizing a first speech segment to provide a first prosodic speech signal, and data indicative of the first prosodic speech signal is stored in a computer memory. The first speech segment is audibly played to a language student, and the student is prompted to recite the speech segment. The speech uttered by the student in response to the prompt is recorded.
Sparse maximum a posteriori (MAP) adaptation
Tue, 03 Mar 2015 08:00:00 EST
Techniques disclosed herein include using a Maximum A Posteriori (MAP) adaptation process that imposes sparseness constraints to generate acoustic parameter adaptation data for specific users based on a relatively small set of training data. The resulting acoustic parameter adaptation data identifies changes for a relatively small fraction of acoustic parameters from a baseline acoustic speech model instead of changes to all acoustic parameters. This results in user-specific acoustic parameter adaptation data that is several orders of magnitude smaller than storage amounts otherwise required for a complete acoustic model. This provides customized acoustic speech models that increase recognition accuracy at a fraction of expected data storage requirements.
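A hedged Python sketch of the sparseness idea: keep only the largest parameter changes relative to a baseline model. The keep fraction and the plain parameter delta (rather than a true MAP update) are assumptions for illustration.

```python
import numpy as np

def sparse_adaptation(baseline, adapted, keep_fraction=0.01):
    """Keep only the largest-magnitude parameter changes from the baseline model.

    Returns (indices, deltas): a compact record covering a small fraction of the
    full parameter set, in the spirit of the sparseness constraint described above.
    """
    delta = adapted - baseline
    k = max(1, int(keep_fraction * delta.size))
    idx = np.argsort(np.abs(delta))[-k:]            # indices of the k largest changes
    return idx, delta[idx]

def apply_adaptation(baseline, idx, deltas):
    """Rebuild a user-specific model from the baseline plus the sparse deltas."""
    model = baseline.copy()
    model[idx] += deltas
    return model

rng = np.random.default_rng(0)
baseline = rng.normal(size=100_000)                  # stand-in for acoustic model parameters
adapted = baseline + rng.normal(scale=0.01, size=baseline.size)
idx, deltas = sparse_adaptation(baseline, adapted)
print(idx.size, "of", baseline.size, "parameters stored")
```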
Systems and methods to present voice message information to a user of a computing device
Tue, 03 Mar 2015 08:00:00 EST
Systems and methods to process and/or present information relating to voice messages that are received for a user from other persons. In one embodiment, a method implemented in a data processing system includes: receiving first data associated with prior communications or activities of a first user on a mobile device; receiving a voice message for the first user; transcribing the voice message using the first data to provide a transcribed message; and sending the transcribed message to the mobile device for display to the first user.
System and method for dynamic noise adaptation for robust automatic speech recognition
Tue, 03 Mar 2015 08:00:00 EST
A speech processing method and arrangement are described. A dynamic noise adaptation (DNA) model characterizes a speech input, reflecting the effects of background noise. A null noise DNA model characterizes the speech input reflecting a null noise mismatch condition. A DNA interaction model performs Bayesian model selection and re-weighting of the DNA model and the null noise DNA model to realize a modified DNA model that characterizes the speech input for automatic speech recognition and compensates for noise to a varying degree, depending on the relative probabilities of the DNA model and the null noise DNA model.
Method and device for classifying background noise contained in an audio signal
Tue, 03 Mar 2015 08:00:00 EST
Embodiments of methods and devices for classifying background noise contained in an audio signal are disclosed. In one embodiment, the device includes a first module for extracting from the audio signal a background noise signal, termed the noise signal. Also included is a second module that calculates a first parameter, termed the temporal indicator, which relates to the temporal evolution of the noise signal. The second module also calculates a second parameter, termed the frequency indicator, which relates to the frequency spectrum of the noise signal. Finally, the device includes a third module that classifies the background noise by selecting, as a function of the calculated values of the temporal indicator and of the frequency indicator, a class of background noise from among a predefined set of classes of background noise.
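A rough Python sketch using frame-energy variability as the temporal indicator and the spectral centroid as the frequency indicator; both indicator definitions and the class cutoffs are assumptions, not the patent's.

```python
import numpy as np

def classify_background_noise(noise, rate=8000, frame=256):
    """Classify a noise signal from one temporal and one frequency indicator."""
    frames = noise[: len(noise) // frame * frame].reshape(-1, frame)
    energies = (frames ** 2).mean(axis=1)
    temporal = energies.std() / (energies.mean() + 1e-12)            # temporal evolution
    spectrum = np.abs(np.fft.rfft(noise))
    freqs = np.fft.rfftfreq(len(noise), d=1.0 / rate)
    frequency = (freqs * spectrum).sum() / (spectrum.sum() + 1e-12)  # spectral centroid in Hz
    if temporal > 0.5:
        return "intermittent"
    return "stationary low-frequency" if frequency < 1000 else "stationary broadband"

rng = np.random.default_rng(1)
print(classify_background_noise(rng.normal(size=16000)))
```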
Turbo processing for speech recognition with local-scale and broad-scale decoders
Tue, 03 Mar 2015 08:00:00 EST
Environmental recognition systems may improve recognition accuracy by leveraging local and nonlocal features in a recognition target. A local decoder may be used to analyze local features, and a nonlocal decoder may be used to analyze nonlocal features. Local and nonlocal estimates may then be exchanged to improve the accuracy of the local and nonlocal decoders. Additional iterations of analysis and exchange may be performed until a predetermined threshold is reached. In some embodiments, the system may comprise extrinsic information extractors to prevent positive feedback loops from causing the system to adhere to erroneous previous decisions.
Deep belief network for large vocabulary continuous speech recognition
Tue, 03 Mar 2015 08:00:00 EST
A method is disclosed herein that includes an act of causing a processor to receive a sample, wherein the sample is one of a spoken utterance, an online handwriting sample, or a moving image sample. The method also comprises the act of causing the processor to decode the sample based at least in part upon an output of a combination of a deep structure and a context-dependent Hidden Markov Model (HMM), wherein the deep structure is configured to output a posterior probability of a context-dependent unit. The deep structure is a Deep Belief Network consisting of many layers of nonlinear units with connecting weights between layers, trained by a pretraining step followed by a fine-tuning step.
Signal processing apparatus having voice activity detection unit and related signal processing methods
Tue, 03 Mar 2015 08:00:00 EST
A signal processing apparatus includes a speech recognition system and a voice activity detection unit. The voice activity detection unit is coupled to the speech recognition system, and arranged for detecting whether an audio signal is a voice signal and accordingly generating a voice activity detection result to the speech recognition system to control whether the speech recognition system should perform speech recognition upon the audio signal.
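A minimal Python sketch of the gating idea: only frames flagged as voice are passed to the recognizer. The energy-threshold detector and the recognize callable are illustrative stand-ins.

```python
import numpy as np

def is_voice(frame, energy_threshold=0.01):
    """Very rough voice activity decision: frame energy above a threshold (assumption)."""
    return float(np.mean(frame ** 2)) > energy_threshold

def gated_recognition(frames, recognize):
    """Only hand frames flagged as voice to the speech recognizer.

    recognize is a hypothetical callable standing in for the speech recognition system.
    """
    return [recognize(f) for f in frames if is_voice(f)]

rng = np.random.default_rng(3)
silence = rng.normal(scale=0.01, size=(160,))
speechy = rng.normal(scale=0.5, size=(160,))
print(len(gated_recognition([silence, speechy], lambda f: "hypothesis")))   # -> 1
```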
Generating a masking signal on an electronic device
Tue, 03 Mar 2015 08:00:00 EST
An electronic device for generating a masking signal is described. The electronic device includes a plurality of microphones and a speaker. The electronic device also includes a processor and executable instructions stored in memory that is in electronic communication with the processor. The electronic device obtains a plurality of audio signals from the plurality of microphones. The electronic device also obtains an ambience signal based on the plurality of audio signals. The electronic device further determines an ambience feature based on the ambience signal. Additionally, the electronic device obtains a voice signal based on the plurality of audio signals. The electronic device also determines a voice feature based on the voice signal. The electronic device additionally generates a masking signal based on the voice feature and the ambience feature. The electronic device further outputs the masking signal using the speaker.
Enhancement of multichannel audio
Tue, 03 Mar 2015 08:00:00 EST
The invention relates to audio signal processing. More specifically, the invention relates to enhancing multichannel audio, such as television audio, by applying a gain to the audio that has been smoothed between portions of the audio. The invention relates to methods, apparatus for performing such methods, and to software stored on a computer-readable medium for causing a computer to perform such methods.
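A minimal Python sketch of applying a gain that is smoothed between portions of multichannel audio, using a one-pole smoother with an assumed coefficient; it is an illustration, not the claimed method.

```python
import numpy as np

def apply_smoothed_gain(audio, block_gains, block_size=1024, alpha=0.9):
    """Apply per-block gains to multichannel audio, smoothing gain changes between blocks.

    audio: array of shape (channels, samples); block_gains: one target gain per block.
    alpha is a one-pole smoothing coefficient (an assumption for illustration).
    """
    out = np.empty_like(audio)
    gain = block_gains[0]
    for i, target in enumerate(block_gains):
        gain = alpha * gain + (1.0 - alpha) * target          # smooth toward the block's gain
        start, stop = i * block_size, (i + 1) * block_size
        out[:, start:stop] = audio[:, start:stop] * gain
    return out

stereo = np.random.default_rng(2).normal(size=(2, 4096))
print(apply_smoothed_gain(stereo, [1.0, 0.5, 0.5, 2.0]).shape)
```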
Decoding apparatus and method, encoding apparatus and method, and program
Tue, 03 Mar 2015 08:00:00 EST
The present invention relates to a decoding apparatus, a decoding method, an encoding apparatus, an encoding method, and programs that can shorten the delay caused by band extension at the time of decoding and restrain the increase in resources required on the decoding side. A higher frequency component generating unit (73) generates a pseudo higher frequency spectrum by using a lower frequency spectrum (SP-L) and a higher frequency envelope (ENV-H). A phase randomizing unit (74) randomizes the phase of the pseudo higher frequency spectrum based on a random flag (RND). An inverse MDCT unit (75) denormalizes the lower frequency spectrum (SP-L) by using a lower frequency envelope (ENV-L) and combines the pseudo higher frequency spectrum supplied from the phase randomizing unit (74) with the denormalized lower frequency spectrum (SP-L). The combination result is used as the spectrum of the entire band. The present invention can be applied, for example, to a decoding apparatus that performs band extension decoding.
Band broadening apparatus and method
Tue, 03 Mar 2015 08:00:00 EST
A band broadening apparatus includes a processor configured to analyze a fundamental frequency based on an input signal band-limited to a first band, generate a signal that includes a second band different from the first band based on the input signal, control a frequency response of the second band based on the fundamental frequency, apply the frequency response of the second band to the signal that includes the second band to generate a frequency-response-adjusted signal that includes the second band, and synthesize the input signal and the frequency-response-adjusted signal.
