Patents

 
Analysis filterbank, synthesis filterbank, encoder, decoder, mixer and conferencing system
Tue, 19 May 2015 08:00:00 EDT
An embodiment of an analysis filterbank for filtering a plurality of time domain input frames, wherein an input frame comprises a number of ordered input samples, comprises a windower configured to generate a plurality of windowed frames, wherein a windowed frame comprises a plurality of windowed samples, wherein the windower is configured to process the plurality of input frames in an overlapping manner using a sample advance value, wherein the sample advance value is less than the number of ordered input samples of an input frame divided by two, and a time/frequency converter configured to provide an output frame comprising a number of output values, wherein an output frame is a spectral representation of a windowed frame.
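The overlapping windowing scheme described above can be sketched as follows; the frame length, hop size (sample advance), and Hann window are illustrative assumptions, with the key constraint being that the advance is less than half the frame length:

```python
import numpy as np

def analysis_filterbank(samples, frame_len=8, advance=2):
    """Sketch of the described analysis filterbank (parameters assumed).

    Frames overlap because the sample advance (hop) is less than
    frame_len / 2; each windowed frame is converted to a spectral
    representation by a time/frequency converter (here, an FFT)."""
    assert advance < frame_len // 2  # the constraint stated in the abstract
    window = np.hanning(frame_len)   # windower (window choice assumed)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, advance):
        windowed = samples[start:start + frame_len] * window
        frames.append(np.fft.rfft(windowed))  # time/frequency conversion
    return frames

spectra = analysis_filterbank(np.arange(32, dtype=float))
```

With 32 input samples, a frame length of 8, and an advance of 2, thirteen overlapping output frames are produced, each a spectral representation of one windowed frame.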
Keyword assessment
Tue, 19 May 2015 08:00:00 EDT
Methods, systems, and techniques for keyword management are described. Some embodiments provide a keyword management system (“KMS”) configured to determine the effectiveness of multiple candidate keywords. In some embodiments, the KMS generates multiple candidate keywords based on an initial keyword. The KMS may then determine an effectiveness score for each of the candidate keywords, based on marketing information about those keywords. Next, the KMS may process the candidate keywords according to the determined effectiveness scores. In some embodiments, processing the candidate keywords includes applying rules that conditionally perform actions with respect to the candidate keywords, such as modifying advertising expenditures, modifying content, or the like.
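The candidate-generation, scoring, and rule-application flow described can be sketched as below; the candidate variants, the use of a click-through-style score as "marketing information", and the threshold rule are all illustrative assumptions:

```python
def assess_keywords(initial_keyword, marketing_info, threshold=0.5):
    """Hypothetical sketch of the described KMS flow: generate candidate
    keywords from an initial keyword, score each from marketing
    information, then apply a conditional rule to the scored list."""
    # Candidate generation (illustrative variants only).
    candidates = [initial_keyword,
                  initial_keyword + " online",
                  "cheap " + initial_keyword]
    # Effectiveness score: here, a rate taken from marketing data.
    scores = {kw: marketing_info.get(kw, 0.0) for kw in candidates}
    # Rule: conditionally keep only candidates that clear the threshold.
    return [kw for kw in candidates if scores[kw] >= threshold]

kept = assess_keywords("shoes", {"shoes": 0.7, "shoes online": 0.4,
                                 "cheap shoes": 0.9})
```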
Comparison of character strings
Tue, 19 May 2015 08:00:00 EDT
A computer-readable, non-transitory medium storing a character string comparison program is provided. The program causes, when executed by a computer, the computer to perform a process including: splitting a first character string and a second character string into words; acquiring, from a storage device, information including a semantic attribute that represents a semantic nature of each of the words and a conceptual code that semantically identifies each of the words; identifying a pair of words having a common semantic attribute between the first character string and the second character string; comparing the conceptual codes of the identified pair of words between the first character string and the second character string; and generating a comparison result between the first character string and the second character string based upon the comparison result of the conceptual codes.
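A minimal sketch of this comparison follows; the lexicon format mapping each word to a (semantic attribute, conceptual code) pair is an assumption standing in for the storage device:

```python
def compare_strings(words1, words2, lexicon):
    """Sketch of the described comparison (lexicon format assumed):
    lexicon maps word -> (semantic_attribute, conceptual_code). Word
    pairs sharing a semantic attribute across the two strings are
    compared by their conceptual codes."""
    attrs1 = {lexicon[w][0]: lexicon[w][1] for w in words1 if w in lexicon}
    attrs2 = {lexicon[w][0]: lexicon[w][1] for w in words2 if w in lexicon}
    common = set(attrs1) & set(attrs2)  # pairs with a common attribute
    # Comparison result: do the conceptual codes match per attribute?
    return {attr: attrs1[attr] == attrs2[attr] for attr in common}

lexicon = {"car": ("vehicle", "C100"), "automobile": ("vehicle", "C100"),
           "red": ("color", "C200"), "blue": ("color", "C201")}
result = compare_strings(["red", "car"], ["blue", "automobile"], lexicon)
```

Here "car" and "automobile" share the conceptual code C100, so the strings agree semantically on the vehicle attribute even though the surface words differ.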
Method for classifying audio signal into fast signal or slow signal
Tue, 19 May 2015 08:00:00 EDT
Low-bit-rate audio coding algorithms such as BWE often face the conflicting goals of achieving high time resolution and high frequency resolution at the same time. To achieve the best possible quality, the input signal can first be classified as a fast signal or a slow signal. This invention focuses on classifying a signal as fast or slow, based on at least one of the following parameters or a combination of them: spectral sharpness, temporal sharpness, pitch correlation (pitch gain), and spectral envelope variation. This classification information can help to choose different BWE algorithms, different coding algorithms, and different post-processing algorithms for fast and slow signals respectively.
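One of the listed parameters, temporal sharpness, can be sketched as a peak-to-mean energy ratio; the exact definition and the decision threshold below are illustrative assumptions, not the patented formulation:

```python
import numpy as np

def classify_fast_slow(frame, sharpness_threshold=4.0):
    """Illustrative sketch of the described fast/slow classifier using
    one parameter, temporal sharpness (peak-to-mean energy ratio);
    the threshold value is an assumption."""
    energy = frame.astype(float) ** 2
    temporal_sharpness = energy.max() / (energy.mean() + 1e-12)
    return "fast" if temporal_sharpness > sharpness_threshold else "slow"

click = np.zeros(64); click[10] = 1.0           # transient-like signal
tone = np.sin(2 * np.pi * np.arange(64) / 16)   # steady periodic signal
```

A transient concentrates its energy in a few samples, giving a high sharpness value ("fast"), while a steady tone spreads energy evenly ("slow"); the classification then steers the choice of BWE, coding, and post-processing algorithms.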
Using a physical phenomenon detector to control operation of a speech recognition engine
Tue, 19 May 2015 08:00:00 EDT
A device may include a physical phenomenon detector. The physical phenomenon detector may detect a physical phenomenon related to the device. In response to detecting the physical phenomenon, the device may record audio data that includes speech. The speech may be transcribed with a speech recognition engine. The speech recognition engine may be included in the device, or may be included with a remote computing device with which the device may communicate.
Method and system for facilitating communications for a user transaction
Tue, 19 May 2015 08:00:00 EDT
Current human-to-machine interfaces enable users to interact with a company's database and enter into a series of transactions (e.g., purchasing products/services and paying bills). Each transaction may require several operations or stages requiring user input or interaction. Some systems enable a user to enter a voice input parameter providing multiple operations of instruction (e.g., single natural language command). However, users of such a system do not know what types of commands the system is capable of accepting. Embodiments of the present invention facilitate communications for user transactions by determining a user's goal transaction and presenting a visual representation of a voice input parameter for the goal transaction. The use of visual representations notifies the user of the system's capability of accepting single natural language commands and the types of commands the system is capable of accepting, thereby enabling a user to complete a transaction in a shorter period of time.
Image processing apparatus and control method thereof and image processing system
Tue, 19 May 2015 08:00:00 EDT
An image processing apparatus including: an image processor which processes a broadcasting signal to display an image based on the processed broadcasting signal; a communication unit which is connected to a server; a voice input unit which receives a user's speech; a voice processor which performs a preset corresponding operation according to a voice command corresponding to the speech; and a controller which processes the voice command corresponding to the speech through one of the voice processor and the server if the speech is input through the voice input unit. If the voice command includes a keyword relating to a call sign of a broadcasting channel, the controller controls one of the voice processor and the server to select a recommended call sign corresponding to the keyword according to a predetermined selection condition, and performs a corresponding operation under the voice command with respect to the broadcasting channel of the recommended call sign.
Script compliance and quality assurance based on speech recognition and duration of interaction
Tue, 19 May 2015 08:00:00 EDT
Apparatus and methods are provided for using automatic speech recognition to analyze a voice interaction and verify the compliance of an agent reading a script to a client during the voice interaction. In one aspect of the invention, a communications system includes a user interface, a communications network, and a call center having an automatic speech recognition component. In other aspects of the invention, a script compliance method includes the steps of conducting a voice interaction between an agent and a client and evaluating the voice interaction with an automatic speech recognition component adapted to analyze the voice interaction and determine whether the agent has adequately followed the script. In still further aspects of the invention, the duration of a given interaction can be analyzed, either apart from or in combination with the script compliance analysis above, to identify instances of agent non-compliance, fraud, or quality-assurance issues.
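The combined script-and-duration check described can be sketched as follows; the phrase-matching approach and the duration threshold are illustrative assumptions about how a compliance verdict might be formed from an ASR transcript:

```python
def check_compliance(transcript, script_phrases, min_duration, duration):
    """Sketch of the described compliance check (thresholds assumed):
    the agent is judged compliant if every required script phrase
    appears in the ASR transcript, and the interaction duration is
    checked separately for suspiciously short calls."""
    text = transcript.lower()
    missing = [p for p in script_phrases if p.lower() not in text]
    duration_ok = duration >= min_duration
    return {"followed_script": not missing, "missing": missing,
            "duration_ok": duration_ok}

report = check_compliance(
    "hello this call may be recorded for quality purposes",
    ["this call may be recorded"], min_duration=30, duration=45)
```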
Automated communication integrator
Tue, 19 May 2015 08:00:00 EDT
An apparatus includes a plurality of applications and an integrator having a voice recognition module configured to identify at least one voice command from a user. The integrator is configured to integrate information from a remote source into at least one of the plurality of applications based on the identified voice command. A method includes analyzing speech from a first user of a first mobile device having a plurality of applications, identifying a voice command based on the analyzed speech using a voice recognition module, and incorporating information from a remote source into at least one of the plurality of applications based on the identified voice command.
Sound localization for user in motion
Tue, 19 May 2015 08:00:00 EDT
Methods, apparatus, and computer programs for simulating the source of sound are provided. One method includes operations for determining a location in space of the head of a user utilizing face recognition of images of the user. Further, the method includes an operation for determining a sound for two speakers, and an operation for determining an emanating location in space for the sound, each speaker being associated with one ear of the user. The acoustic signals for each speaker are established based on the location in space of the head, the sound, the emanating location in space, and the auditory characteristics of the user. In addition, the acoustic signals are transmitted to the two speakers. When the acoustic signals are played by the two speakers, the acoustic signals simulate that the sound originated at the emanating location in space.
Speech effects
Tue, 19 May 2015 08:00:00 EDT
A method of complementing a spoken text. The method including receiving text data representative of a natural language text, receiving effect control data including at least one effect control record, each effect control record being associated with a respective location in the natural language text, receiving a stream of audio data, analyzing the stream of audio data for natural language utterances that correlate with the natural language text at a respective one of the locations, and outputting, in response to a determination by the analyzing that a natural language utterance in the stream of audio data correlates with a respective one of the locations, at least one effect control signal based on the effect control record associated with the respective location.
Email administration for rendering email on a digital audio player
Tue, 19 May 2015 08:00:00 EDT
Methods, systems, and computer program products are provided for email administration for rendering email on a digital audio player. Embodiments include retrieving an email message; extracting text from the email message; creating a media file; and storing the extracted text of the email message as metadata associated with the media file. Embodiments may also include storing the media file on a digital audio player and displaying the metadata describing the media file, the metadata containing the extracted text of the email message.
Automatic disclosure detection
Tue, 19 May 2015 08:00:00 EDT
A method of detecting pre-determined phrases to determine compliance quality is provided. The method includes determining whether at least one of an event or a precursor event has occurred based on a comparison between pre-determined phrases and a communication between a sender and a recipient in a communications network, and rating the recipient based on the presence of the pre-determined phrases associated with the event or the presence of the pre-determined phrases associated with the precursor event in the communication.
Computing numeric representations of words in a high-dimensional space
Tue, 19 May 2015 08:00:00 EDT
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for computing numeric representations of words. One of the methods includes obtaining a set of training data, wherein the set of training data comprises sequences of words; training a classifier and an embedding function on the set of training data, wherein training the embedding function comprises obtaining trained values of the embedding function parameters; processing each word in the vocabulary using the embedding function in accordance with the trained values of the embedding function parameters to generate a respective numeric representation of each word in the vocabulary in the high-dimensional space; and associating each word in the vocabulary with the respective numeric representation of the word in the high-dimensional space.
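The final association step can be sketched as below; representing the embedding function as a parameter matrix is standard, but the random initialization here merely stands in for the trained values the abstract describes:

```python
import numpy as np

def embed_vocabulary(vocab, dim=8, seed=0):
    """Minimal sketch of the mapping step described: the embedding
    function (here a parameter matrix, randomly initialized in place
    of actual trained values) maps each vocabulary word to a numeric
    representation in a high-dimensional space."""
    rng = np.random.default_rng(seed)
    params = rng.normal(size=(len(vocab), dim))  # stand-in for trained values
    # Associate each word with its numeric representation.
    return {word: params[i] for i, word in enumerate(vocab)}

vectors = embed_vocabulary(["the", "cat", "sat"])
```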
Efficient exploitation of model complementariness by low confidence re-scoring in automatic speech recognition
Tue, 19 May 2015 08:00:00 EDT
A method for speech recognition is described that uses an initial recognizer to perform an initial speech recognition pass on an input speech utterance to determine an initial recognition result corresponding to the input speech utterance, and a reliability measure reflecting a per word reliability of the initial recognition result. For portions of the initial recognition result where the reliability of the result is low, a re-evaluation recognizer is used to perform a re-evaluation recognition pass on the corresponding portions of the input speech utterance to determine a re-evaluation recognition result corresponding to the re-evaluated portions of the input speech utterance. The initial recognizer and the re-evaluation recognizer are complementary so as to make different recognition errors. A final recognition result is determined based on the re-evaluation recognition result if any, and otherwise based on the initial recognition result.
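The two-pass scheme can be sketched as follows; the confidence threshold is an assumption, and `reevaluate` is a hypothetical callback standing in for the complementary re-evaluation recognizer:

```python
def rescore(initial_words, confidences, reevaluate, threshold=0.6):
    """Sketch of the described two-pass recognition (threshold assumed):
    words whose first-pass per-word reliability falls below the
    threshold are re-recognized by a complementary recognizer
    (`reevaluate`, a hypothetical callback taking the word position);
    reliable words keep their initial result."""
    final = []
    for i, (word, conf) in enumerate(zip(initial_words, confidences)):
        final.append(reevaluate(i) if conf < threshold else word)
    return final

result = rescore(["recognize", "peach", "today"], [0.9, 0.3, 0.8],
                 lambda i: "speech")
```

Only the low-confidence portion is re-evaluated, so the second recognizer's different error pattern can correct the first pass without rerunning the whole utterance.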
User intention based on N-best list of recognition hypotheses for utterances in a dialog
Tue, 19 May 2015 08:00:00 EDT
Disclosed herein are systems, computer-implemented methods, and tangible computer-readable media for using alternate recognition hypotheses to improve whole-dialog understanding accuracy. The method includes receiving an utterance as part of a user dialog, generating an N-best list of recognition hypotheses for the user dialog turn, selecting an underlying user intention based on a belief distribution across the generated N-best list and at least one contextually similar N-best list, and responding to the user based on the selected underlying user intention. Selecting an intention can further be based on confidence scores associated with recognition hypotheses in the generated N-best lists, and also on the probability of a user's action given their underlying intention. A belief or cumulative confidence score can be assigned to each inferred user intention.
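The belief-accumulation step can be sketched as below; representing each N-best list as (hypothesized intention, confidence) pairs and summing confidences into a cumulative belief is an illustrative simplification of the described distribution:

```python
from collections import defaultdict

def select_intention(nbest_lists):
    """Sketch of the described selection: given the turn's N-best list
    plus contextually similar N-best lists, each holding (intention,
    confidence) pairs, confidences accumulate into a belief per
    intention, and the highest cumulative belief wins."""
    belief = defaultdict(float)
    for nbest in nbest_lists:
        for intention, confidence in nbest:
            belief[intention] += confidence
    return max(belief, key=belief.get)

turn = [("check_balance", 0.5), ("transfer", 0.3)]
similar = [("transfer", 0.6), ("check_balance", 0.2)]
best = select_intention([turn, similar])
```

Although "check_balance" tops the current turn's list, the contextually similar list shifts the cumulative belief toward "transfer", illustrating how alternate hypotheses improve whole-dialog understanding.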
Methods and systems for dictation and transcription
Tue, 19 May 2015 08:00:00 EDT
Automated delivery and filing of transcribed material prepared from dictated audio files into a central record-keeping system are presented. A user dictates information from any location, uploads that audio file to a transcriptionist to be transcribed, and the transcribed material is automatically delivered into a central record keeping system, filed with the appropriate client or matter file, and the data stored in the designated appropriate fields within those client or matter files. Also described is the recordation of meetings from multiple sources using mobile devices and the detection of the active or most prominent speaker at given intervals in the meeting. Further, text boxes on websites are completed using an audio recording application and offsite transcription.
Dynamic long-distance dependency with conditional random fields
Tue, 19 May 2015 08:00:00 EDT
Dynamic features are utilized with CRFs to handle long-distance dependencies of output labels. The dynamic features present a probability distribution involved in explicit distance from/to a special output label that is pre-defined according to each application scenario. Besides the number of units in the segment (from the previous special output label to the current unit), the dynamic features may also include the sum of any basic features of units in the segment. Since the added dynamic features are involved in the distance from the previous specific label, the searching lattice associated with Viterbi searching is expanded to distinguish the nodes with various distances. The dynamic features may be used in a variety of different applications, such as Natural Language Processing, Text-To-Speech and Automatic Speech Recognition. For example, the dynamic features may be used to assist in prosodic break and pause prediction.
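The distance component of these dynamic features can be sketched as follows; treating the feature as a simple count of units since the previous special label is an illustrative reading of "the number of units in the segment":

```python
def dynamic_distance_features(labels, special="B"):
    """Sketch of the distance part of the described dynamic features:
    for each position, the number of units in the current segment,
    i.e. since the previous special output label (label alphabet and
    the choice of "B" as the special label are assumptions)."""
    features, dist = [], 0
    for label in labels:
        # Reset at the special label; otherwise extend the segment.
        dist = 0 if label == special else dist + 1
        features.append(dist)
    return features

feats = dynamic_distance_features(["B", "I", "I", "B", "I"])
```

In a full CRF, the Viterbi lattice would be expanded so that nodes at different distances from the special label are distinguished, as the abstract notes.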
Selection of text prediction results by an accessory
Tue, 19 May 2015 08:00:00 EDT
A method for entering text in a text input field using a non-keyboard type accessory includes selecting a character for entry into the text field presented by a portable computing device. The portable computing device determines whether text suggestions are available based on the character. If text suggestions are available, the portable computing device can determine the text suggestions and send them to the accessory, which in turn can display the suggestions on a display. A user operating the accessory can select one of the text suggestions, expressly reject the text suggestions, or ignore the text suggestions. If a text suggestion is selected, the accessory can send the selected text to the portable computing device for populating the text field.
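The device/accessory exchange can be sketched in two halves; the prefix-match dictionary and the index-based selection protocol are illustrative assumptions:

```python
def suggest(prefix, dictionary):
    """Device side of the described exchange: given the characters
    entered so far, return any available text suggestions for the
    accessory to display (dictionary contents are illustrative)."""
    return [w for w in dictionary if w.startswith(prefix)]

def accessory_choice(suggestions, selected_index=None):
    """Accessory side: select one suggestion by index, or return None
    to represent rejecting or ignoring the suggestions."""
    if selected_index is None or not suggestions:
        return None
    return suggestions[selected_index]

words = ["hello", "help", "cat"]
text = accessory_choice(suggest("hel", words), selected_index=1)
```

If the accessory returns a selection, the device populates the text field with it; a `None` return leaves the user's typed characters unchanged.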
Systems, methods, apparatus, and computer-readable media for spatially selective audio augmentation
Tue, 19 May 2015 08:00:00 EDT
Spatially selective augmentation of a multichannel audio signal is described.
Audio codec supporting time-domain and frequency-domain coding modes
Tue, 19 May 2015 08:00:00 EDT
An audio codec supporting both, time-domain and frequency-domain coding modes, having low-delay and an increased coding efficiency in terms of iterate/distortion ratio, is obtained by configuring the audio encoder such that same operates in different operating modes such that if the active operative mode is a first operating mode, a mode dependent set of available frame coding modes is disjoined to a first subset of time-domain coding modes, and overlaps with a second subset of frequency-domain coding modes, whereas if the active operating mode is a second operating mode, the mode dependent set of available frame coding modes overlaps with both subsets, i.e. the subset of time-domain coding modes as well as the subset of frequency-domain coding modes.
Method and apparatus for audio coding and decoding
Tue, 19 May 2015 08:00:00 EDT
An encoder and decoder for processing an audio signal including generic audio and speech frames are provided herein. During operation, two encoders are utilized by the speech coder, and two decoders are utilized by the speech decoder. The two encoders and decoders are utilized to process speech and non-speech (generic audio) respectively. During a transition between generic audio and speech, parameters that are needed by the speech decoder for decoding frame of speech are generated by processing the preceding generic audio (non-speech) frame for the necessary parameters. Because necessary parameters are obtained by the speech coder/decoder, the discontinuities associated with prior-art techniques are reduced when transitioning between generic audio frames and speech frames.
Limiting notification interruptions
Tue, 19 May 2015 08:00:00 EDT
Techniques for a computing device operating in limited-access states are provided. One example method includes determining, by a computing device, that a notification is scheduled for output by the computing device during a first time period and that a pattern of audio detected during the first time period is indicative of human speech. The method further includes delaying output of the notification during the first time period and determining that a pattern of audio detected during a second time period is not indicative of human speech. The method also includes outputting at least a portion of the notification at the earlier in time of the end of the second time period or the expiration of a third time period.
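The delay-until-quiet-or-deadline logic can be sketched as below; modeling time as discrete periods and capping the delay at a fixed number of periods are illustrative assumptions:

```python
def notification_time(speech_detected, schedule_period=0, max_delay=3):
    """Sketch of the described delaying logic (discrete periods are an
    assumption): speech_detected[i] says whether audio in period i has
    a pattern indicative of human speech. Output is deferred past
    speech-filled periods, but never beyond max_delay periods after
    the scheduled one."""
    deadline = schedule_period + max_delay
    period = schedule_period
    while period < deadline and speech_detected[period]:
        period += 1  # delay while the detected audio looks like speech
    return period    # earlier of: a quiet period, or the expiration

when = notification_time([True, True, False, False])
```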
Efficient coding of overcomplete representations of audio using the modulated complex lapped transform (MCLT)
Tue, 19 May 2015 08:00:00 EDT
An “Overcomplete Audio Coder” provides various techniques for overcomplete encoding audio signals using an MCLT-based predictive coder. Specifically, the Overcomplete Audio Coder uses unrestricted polar quantization of MCLT magnitude and phase coefficients. Further, quantized magnitude and phase coefficients are predicted based on properties of the audio signal and corresponding MCLT coefficients to reduce the bit rate overhead in encoding the audio signal. This prediction allows the Overcomplete Audio Coder to provide improved continuity of the magnitude of spectral components across encoded signal blocks, thereby reducing warbling artifacts. Coding rates achieved using these prediction techniques are comparable to that of encoding an orthogonal representation of an audio signal, such as with modulated lapped transform (MLT)-based coders. Finally, the Overcomplete Audio Coder provides a true magnitude-phase frequency-domain representation of the audio signal, thus allowing precise auditory models to be applied for improving compression performance, without the need for additional Fourier transforms.
Embedder for embedding a watermark into an information representation, detector for detecting a watermark in an information representation, method and computer program
Tue, 19 May 2015 08:00:00 EDT
An embedder for embedding a watermark to be embedded into an input information representation comprises an embedding parameter determiner that is implemented to apply a derivation function once or several times to an initial value to obtain an embedding parameter for embedding the watermark into the input information representation. Further, the embedder comprises a watermark adder that is implemented to provide the input information representation with the watermark using the embedding parameter. The embedder is implemented to select how many times the derivation function is to be applied to the initial value.
Relation topic construction and its application in semantic relation extraction
Tue, 19 May 2015 08:00:00 EDT
Systems and method automatically collect training data from manually created semantic relations, automatically extract rules from the training data to produce extracted rules, and automatically characterize existing semantic relations in the training data based on co-occurrence of the extracted rules in the existing semantic relations. Such systems and methods automatically construct semantic relation topics based on the characterization of the existing semantic relations, and group instances of the training data into the semantic relation topics to detect new semantic relations.
Systems and methods for multiple mode voice and data communications using intelligently bridged TDM and packet buses and methods for implementing language capabilities using the same
Tue, 19 May 2015 08:00:00 EDT
A system for managing voice communications provides voice prompts in one or more particular languages or language variants. High-level grammatical rules are defined for the set of voice prompts in the particular language. The grammatical rules for the set of voice prompts are stored in the system. A set of audio files is developed in the particular language. The audio files in the particular language are stored in the system. A request is received from a user, and the system initiates a request for a voice prompt. A sequential list of audio files is developed, and when the sequential list of audio files is played by the system, the requested voice prompt is played to the user in the particular language. The sequential list of audio files is produced based on the grammatical rules and voice communications are managed in the system.
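The rule-driven assembly of a prompt can be sketched as follows; the grammar-as-callback format, the token names, and the file names are all illustrative assumptions:

```python
def build_prompt(value, grammar, audio_files):
    """Sketch of the described assembly (grammar format assumed): a
    grammatical rule expands a requested prompt into an ordered list
    of audio-file tokens; resolving each token yields the sequential
    list of audio files that, played in order, speaks the prompt in
    the target language."""
    return [audio_files[token] for token in grammar(value)]

# Illustrative English rule for a prompt like "you owe N dollars".
grammar = lambda n: ["you_owe", f"num_{n}", "dollars"]
files = {"you_owe": "you_owe.wav", "num_5": "5.wav",
         "dollars": "dollars.wav"}
playlist = build_prompt(5, grammar, files)
```

Swapping in a different language's grammar rule and audio files produces the same prompt in that language without changing the assembly logic.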
Text overlay techniques in realtime translation
Tue, 19 May 2015 08:00:00 EDT
Architecture that employs techniques for overlaying (superimposing) translated text on top of scanned text in realtime translation, to provide a clear visual correlation between original text and translated text. Algorithms are provided that overlay text when translating scanned text from a language written in a first direction to a language written in the same direction, from a first language written in a first direction to a second language written in the opposite direction, and from a language written in a first direction to a language written in a different direction.
Method for establishing paraphrasing data for machine translation system
Tue, 19 May 2015 08:00:00 EDT
A method for establishing paraphrasing data for a machine translation system includes selecting a paraphrasing target sentence through application of an object language model to a translated sentence that is obtained by machine-translating a source language sentence, extracting paraphrasing candidates that can be paraphrased with the paraphrasing target sentence from a source language corpus DB, performing machine translation with respect to the paraphrasing candidates, selecting a final paraphrasing candidate by applying the object language model to the result of the machine translation with respect to the paraphrasing candidates, and confirming the paraphrasing target sentence and the final paraphrasing candidate as paraphrasing lexical patterns using a bilingual corpus and storing the paraphrasing lexical patterns in a paraphrasing DB. According to the present invention, consistent paraphrasing data can be established, since the paraphrasing data is constructed automatically.
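The candidate-selection step can be sketched as below; `translate` and `lm_score` are hypothetical callbacks standing in for the machine translation system and the object (target-side) language model:

```python
def select_paraphrase(candidates, translate, lm_score):
    """Sketch of the selection step described: each paraphrasing
    candidate is machine-translated, the object language model scores
    the translation, and the best-scoring candidate becomes the final
    paraphrase (translate and lm_score are hypothetical callbacks)."""
    return max(candidates, key=lambda c: lm_score(translate(c)))

# Illustrative stand-ins for the MT system and the object language model.
translations = {"it is raining": "il pleut", "rain falls": "pluie tombe"}
scores = {"il pleut": 0.9, "pluie tombe": 0.2}
best = select_paraphrase(list(translations), translations.get, scores.get)
```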
Multi-device video communication session
Tue, 19 May 2015 08:00:00 EDT
A method of adding a computing device to a multi-device video communication session. A server receives recorded content from a plurality of multi-device video communication sessions and a search request from a computing device. The server identifies a first multi-device video communication session based on the search request. The first multi-device video communication session includes a weighted list of text elements. The server transmits information based on the weighted list of text elements to the computing device, receives a selection from the computing device corresponding to a first text element, and transmits at least a portion of the recorded content from the first multi-device video communication session to the computing device based on the first text element. The server receives an add request for the computing device to be added to the first multi-device video communication session and transmits the add request to the first multi-device video communication session.
