Patents

Method for changing over from a first adaptive data processing version to a second adaptive data processing version
Tue, 14 Apr 2015 08:00:00 EDT
The invention relates to a method and to a system for changing over from a first adaptive data processing version (V1) on data processing means using at least one data model (dm) which is continuously adapted on the basis of data processing results to a second adaptive data processing version (V2) also using at least one data model (DM) to be continuously adapted, characterized in that, in a first phase, the second adaptive data processing version (V2) is used in parallel to the first data processing version (V1), thereby continuously adapting said at least one data model (dm) related to the first version (V1) as well as the data model (DM) related to the second version (V2), and in that the performance of data processing by means of the second version (V2) is checked to comply with a quality criterion, whereafter, in a second phase, as soon as said criterion is met, the results of the data processing by means of the second version (V2) are outputted to be used. The invention further relates to a computer program product having a computer program recorded thereon which is adapted to carry out such a method.
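The two-phase changeover the abstract describes can be sketched as a shadow-mode run: V2 processes the same data as V1, both models keep adapting, and output switches to V2 only once a quality criterion is met. This is a minimal illustration with invented names (`process`, `adapt`, `quality`), not the patented implementation.

```python
def changeover(stream, v1, v2, quality, threshold):
    """Yield V1's results until V2 meets the quality criterion, then V2's."""
    use_v2 = False
    for item in stream:
        r1 = v1.process(item)   # first version stays live
        r2 = v2.process(item)   # second version runs in parallel (phase 1)
        v1.adapt(item, r1)      # both data models are continuously adapted
        v2.adapt(item, r2)
        if not use_v2 and quality(r2, r1) >= threshold:
            use_v2 = True       # phase 2: criterion met, switch the output
        yield r2 if use_v2 else r1
```

The switch is one-way here; a production system might also fall back to V1 if V2's quality later degrades.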
Instant messaging association to remote desktops
Tue, 14 Apr 2015 08:00:00 EDT
A remote desktop capability includes a message area on the agent's remote desktop display. Incoming instant messages on an instant messaging application on the agent's primary desktop are passed through to the message area on the remote desktop display.
Method of searching text based on two computer hardware processing properties: indirect memory addressing and ASCII encoding
Tue, 14 Apr 2015 08:00:00 EDT
A method and process for searching and inserting a word or set of words in a large data set for real-time, data-intensive search applications using memory banks is disclosed. Traditional search methods optimize time and space by pre-sorting the data so that fast search can be accomplished. Unfortunately, in real-time, search-intensive applications, it is almost impossible to take a snapshot of the data set while transactions are happening in order to sort it and search for a word or set of words. The instant method and process is an innovative way to use an unordered list for searching the data in real time without the requirement to pre-sort the data. The time complexity of the proposed method is very low. In addition, the proposed method performs both insertion and searching, reducing code complexity and time by using indirect addressing in memory banks.
Audio encoding and decoding to generate binaural virtual spatial signals
Tue, 14 Apr 2015 08:00:00 EDT
An audio encoder comprises a multi-channel receiver (401) which receives an M-channel audio signal where M>2. A down-mix processor (403) down-mixes the M-channel audio signal to a first stereo signal and associated parametric data and a spatial processor (407) modifies the first stereo signal to generate a second stereo signal in response to the associated parametric data and spatial parameter data for a binaural perceptual transfer function, such as a Head Related Transfer Function (HRTF). The second stereo signal is a binaural signal and may specifically be a (3D) virtual spatial signal. An output data stream comprising the encoded data and the associated parametric data is generated by an encode processor (411) and an output processor (413). The HRTF processing may allow the generation of a (3D) virtual spatial signal by conventional stereo decoders. A multi-channel decoder may reverse the process of the spatial processor (407) to generate an improved quality multi-channel signal.
Voice dialog system with reject avoidance process
Tue, 14 Apr 2015 08:00:00 EDT
The invention relates to a process for operating a voice dialog system and a voice dialog system which can be controlled over a telecommunications link by a communications terminal, a speech element transmitted by the communications terminal being received by a receiving unit of the voice dialog system and being analyzed for statement content in a processing unit, the speech element being filed in a memory assigned to the processing unit and after the telecommunications link is broken being analyzed by the processing unit.
Hosted voice recognition system for wireless devices
Tue, 14 Apr 2015 08:00:00 EDT
Methods, systems, and software for converting the audio input of a user of a hand-held client device or mobile phone into a textual representation by means of a backend server accessed by the device through a communications network. The text is then inserted into or used by an application of the client device to send a text message, instant message, email, or to insert a request into a web-based application or service. In one embodiment, the method includes the steps of initializing or launching the application on the device; recording and transmitting the recorded audio message from the client device to the backend server through a client-server communication protocol; converting the transmitted audio message into the textual representation in the backend server; and sending the converted text message back to the client device or forwarding it on to an alternate destination directly from the server.
Program endpoint time detection apparatus and method, and program information retrieval system
Tue, 14 Apr 2015 08:00:00 EDT
This invention relates to retrieval of multimedia content, and provides a program endpoint time detection apparatus for detecting an endpoint time of a program by performing processing on audio signals of said program, comprising an audio classification unit for classifying said audio signals into a speech signal portion and a non-speech signal portion; a keyword retrieval unit for retrieving, as a candidate endpoint keyword, an endpoint keyword indicating start or end of the program from said speech signal portion; a content analysis unit for performing content analysis on the context of the candidate endpoint keyword retrieved by the keyword retrieval unit to determine whether the candidate endpoint keyword is a valid endpoint keyword; and a program endpoint time determination unit for performing statistical analysis based on the retrieval result of said keyword retrieval unit and the determination result of said content analysis unit, and determining the endpoint time of the program. In addition, this invention also provides a program information retrieval system. With the present invention, program information regarding a program attended by a user can be rapidly obtained.
Multisensory speech detection
Tue, 14 Apr 2015 08:00:00 EDT
A computer-implemented method of multisensory speech detection is disclosed. The method comprises determining an orientation of a mobile device and determining an operating mode of the mobile device based on the orientation of the mobile device. The method further includes identifying speech detection parameters that specify when speech detection begins or ends based on the determined operating mode and detecting speech from a user of the mobile device based on the speech detection parameters.
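The flow in this abstract — orientation selects an operating mode, and the mode selects the speech-detection parameters — can be sketched as a small lookup. The mode names, pitch bounds, and threshold values below are invented for illustration only.

```python
def operating_mode(pitch_deg):
    """Map device pitch (degrees from horizontal) to an assumed operating mode."""
    if pitch_deg > 60:
        return "phone-to-ear"      # held against the ear
    if pitch_deg > 20:
        return "walkie-talkie"     # held in front of the mouth
    return "on-table"

# Per-mode detection parameters: energy levels at which detection
# begins and ends (values are invented placeholders).
PARAMS = {
    "phone-to-ear":  {"start_db": -40, "end_db": -50},
    "walkie-talkie": {"start_db": -35, "end_db": -45},
    "on-table":      {"start_db": -25, "end_db": -35},
}

def detection_params(pitch_deg):
    """Speech-detection parameters for the device's current orientation."""
    return PARAMS[operating_mode(pitch_deg)]
```

A device resting on a table gets less sensitive thresholds than one held to the ear, since the microphone is farther from the speaker.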
System and method for singing synthesis capable of reflecting voice timbre changes
Tue, 14 Apr 2015 08:00:00 EDT
Herein provided is a system for singing synthesis capable of reflecting not only pitch and dynamics changes but also timbre changes of a user's singing. A spectral transform surface generating section 119 temporally concatenates all the spectral transform curves estimated by a second spectral transform curve estimating section 117 to define a spectral transform surface. A synthesized audio signal generating section 121 generates a transform spectral envelope at each instant of time by scaling a reference spectral envelope based on the spectral transform surface. Then, the synthesized audio signal generating section 121 generates an audio signal of a synthesized singing voice reflecting timbre changes of an input singing voice, based on the transform spectral envelope and a fundamental frequency contained in reference singing voice source data.
Apparatus, method, and program for reading aloud documents based upon a calculated word presentation order
Tue, 14 Apr 2015 08:00:00 EDT
According to one embodiment, a reading aloud support apparatus includes a reception unit, a first extraction unit, a second extraction unit, an acquisition unit, a generation unit, and a presentation unit. The reception unit is configured to receive an instruction. The first extraction unit is configured to extract, as a partial document, a part of a document which corresponds to a range of words. The second extraction unit is configured to perform morphological analysis and to extract words as candidate words. The acquisition unit is configured to acquire attribute information items related to the candidate words. The generation unit is configured to perform weighting relating to a value corresponding to a distance and to determine each of the candidate words to be preferentially presented, to generate a presentation order. The presentation unit is configured to present the candidate words and the attribute information items in accordance with the presentation order.
System and method for cloud-based text-to-speech web services
Tue, 14 Apr 2015 08:00:00 EDT
Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating speech. One variation of the method is from a server side, and another variation of the method is from a client side. The server side method, as implemented by a network-based automatic speech processing system, includes first receiving, from a network client independent of knowledge of internal operations of the system, a request to generate a text-to-speech voice. The request can include speech samples, transcriptions of the speech samples, and metadata describing the speech samples. The system extracts sound units from the speech samples based on the transcriptions and generates an interactive demonstration of the text-to-speech voice based on the sound units, the transcriptions, and the metadata, wherein the interactive demonstration hides a back end processing implementation from the network client. The system provides access to the interactive demonstration to the network client.
Recognition of speech with different accents
Tue, 14 Apr 2015 08:00:00 EDT
Computer-based speech recognition can be improved by recognizing words with an accurate accent model. In order to provide a large number of possible accents, while providing real-time speech recognition, a language tree data structure of possible accents is provided in one embodiment such that a computerized speech recognition system can benefit from choosing among accent categories when searching for an appropriate accent model for speech recognition.
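The language-tree idea in this abstract — narrowing from language to region to local accent rather than scoring every accent model — can be sketched as a greedy descent. The tree contents and the scoring function are invented stand-ins; a real system would score each branch against the incoming audio.

```python
# Toy accent tree: language -> region -> local accent models.
ACCENT_TREE = {
    "english": {
        "british": ["rp", "scottish"],
        "american": ["general", "southern"],
    },
}

def best_accent(tree, score):
    """Greedy descent: at each level, keep only the best-scoring branch.

    `score` maps a branch name to a number (a stand-in for how well that
    accent category matches the input speech).
    """
    node = tree
    path = []
    while isinstance(node, dict):
        key = max(node, key=score)
        path.append(key)
        node = node[key]
    path.append(max(node, key=score))   # leaf level: pick an accent model
    return path
```

With N branches per level and depth D, the descent scores roughly N·D categories instead of N^D leaf models, which is what makes real-time selection among many accents plausible.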
Method, medium, and system detecting speech using energy levels of speech frames
Tue, 14 Apr 2015 08:00:00 EDT
A speech recognition method, medium, and system. The method includes detecting an energy change of each frame making up signals including speech and non-speech signals, and identifying a speech segment corresponding to frames that include only speech signals from among the frames based on the detected energy change.
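The per-frame energy computation the abstract relies on can be sketched in a few lines. The fixed threshold here is a simplification of the abstract's "detected energy change"; real systems adapt the threshold to the noise floor.

```python
def frame_energies(samples, frame_len):
    """Split samples into frames and return the energy of each frame."""
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    return [sum(x * x for x in f) for f in frames]

def speech_frames(samples, frame_len, threshold):
    """Indices of frames whose energy exceeds the threshold (assumed speech)."""
    return [i for i, e in enumerate(frame_energies(samples, frame_len))
            if e > threshold]
```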
Specific call detecting device and specific call detecting method
Tue, 14 Apr 2015 08:00:00 EDT
A specific call detecting device includes: an utterance period detecting unit which detects at least a first utterance period in which the first speaker speaks in a call between a first speaker and a second speaker; an utterance ratio calculating unit which calculates the utterance ratio of the first speaker in the call; a voice recognition execution determining unit which determines whether at least one of the first voice of the first speaker and the second voice of the second speaker becomes a target of voice recognition or not on the basis of the utterance ratio of the first speaker; a voice recognizing unit which detects a keyword related to a specific call from the voice determined as a target of voice recognition among the first and second voices; and a determining unit which determines whether the call is the specific call or not on the basis of the detected keyword.
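The gating logic above — run keyword spotting only when one speaker's share of the talk time is suspicious — can be sketched as follows. The 0.7 bound, the keyword list, and the string-matching stand-in for voice recognition are all invented for illustration.

```python
def utterance_ratio(periods_a, periods_b):
    """Fraction of total utterance time belonging to speaker A.

    Each period is a (start, end) pair in seconds.
    """
    ta = sum(end - start for start, end in periods_a)
    tb = sum(end - start for start, end in periods_b)
    return ta / (ta + tb) if ta + tb else 0.0

def is_specific_call(periods_a, periods_b, transcript, keywords, bound=0.7):
    """Run keyword detection only if speaker A dominates the call."""
    if utterance_ratio(periods_a, periods_b) < bound:
        return False                      # recognition is skipped entirely
    return any(k in transcript for k in keywords)
```

Gating on the utterance ratio first means the expensive recognition step runs only on calls that already look one-sided (e.g. a scam caller doing most of the talking).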
System and method for disambiguating multiple intents in a natural language dialog system
Tue, 14 Apr 2015 08:00:00 EDT
The present invention addresses the deficiencies in the prior art by providing an improved dialog for disambiguating a user utterance containing more than one intent. The invention comprises methods, computer-readable media, and systems for engaging in a dialog. The method embodiment of the invention relates to a method of disambiguating a user utterance containing at least two user intents. The method comprises establishing a confidence threshold for spoken language understanding so that multiple intents are returned, determining whether a received utterance comprises a first intent and a second intent and, if the received utterance contains the first intent and the second intent, disambiguating the first intent and the second intent by presenting a disambiguation sub-dialog wherein the user is offered a choice of which intent to process first, the user first being presented with whichever of the first and second intents has the lower confidence score.
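The ordering rule in this abstract — ask the user about the lowest-confidence intent first — reduces to a sort. The intent tuples and prompt wording below are invented; this is a sketch of the ordering step only, not the full dialog system.

```python
def disambiguation_prompt(intents):
    """Order (name, confidence) intents lowest-confidence-first and
    build the sub-dialog prompt offering the user the choice."""
    ordered = sorted(intents, key=lambda it: it[1])
    names = [name for name, _ in ordered]
    return ordered, "Which would you like first: %s?" % " or ".join(names)
```

Presenting the lower-confidence intent first gives the system the earliest chance to confirm or discard its shakiest hypothesis.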
Model-driven candidate sorting
Tue, 14 Apr 2015 08:00:00 EDT
Methods and systems for model-driven candidate sorting for evaluating digital interviews are described. In one embodiment, a model-driven candidate-sorting tool selects a data set of digital interview data for sorting. The data set includes candidate data for interviewing candidates (also referred to herein as interviewees). The model-driven candidate-sorting tool analyzes the candidate data for each interviewing candidate to identify digital interviewing cues and applies those cues to a prediction model to predict an achievement index for the respective interviewing candidate. This is performed without reviewer input at the model-driven candidate-sorting tool. The list of interview candidates is sorted according to the predicted achievement indices, and the sorted list is presented to the reviewer in a user interface.
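The pipeline the abstract outlines — cues in, achievement index out, candidates sorted by index — can be sketched with a linear model standing in for the prediction model. The cue names and weights are invented; the actual model is not specified by the abstract.

```python
# Assumed cue weights for the stand-in linear prediction model.
WEIGHTS = {"fluency": 0.6, "eye_contact": 0.4}

def achievement_index(cues):
    """Predict an achievement index from a dict of interviewing cues."""
    return sum(WEIGHTS.get(k, 0.0) * v for k, v in cues.items())

def sort_candidates(candidates):
    """Sort candidate records by predicted achievement index, best first.

    Runs with no reviewer input, matching the abstract's description.
    """
    return sorted(candidates,
                  key=lambda c: achievement_index(c["cues"]),
                  reverse=True)
```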
Multiple subspace discriminative feature training
Tue, 14 Apr 2015 08:00:00 EDT
Methods and apparatus related to speech recognition performed by a speech recognition device are disclosed. The speech recognition device can receive a plurality of samples corresponding to an utterance and generate a feature vector z from the plurality of samples. The speech recognition device can select a first frame y0 from the feature vector z, and can generate a second frame y1, where y0 and y1 differ. The speech recognition device can generate a modified frame x′ based on the first frame y0 and the second frame y1 and then recognize speech related to the utterance based on the modified frame x′. The recognized speech can be output by the speech recognition device.
Pattern processing system specific to a user group
Tue, 14 Apr 2015 08:00:00 EDT
Methods and apparatus for identifying a user group in connection with user group-based speech recognition. An exemplary method comprises receiving, from a user, a user group identifier that identifies a user group to which the user was previously assigned based on training data. The user group comprises a plurality of individuals including the user. The method further comprises using the user group identifier, identifying a pattern processing data set corresponding to the user group, and receiving speech input from the user to be recognized using the pattern processing data set.
Machine translation of indirect speech
Tue, 14 Apr 2015 08:00:00 EDT
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating direct speech messages based on voice commands that include indirect speech messages. In one aspect, a method includes receiving a voice input corresponding to an utterance. A determination is made whether a transcription of the utterance includes a command to initiate a communication to a user and a segment that is classified as indirect speech. In response to determining that the transcription of the utterance includes the command and the segment that is classified as indirect speech, the segment that is classified as indirect speech is provided as input to a machine translator. In response to providing the segment that is classified as indirect speech to the machine translator, a direct speech segment is received from the machine translator. A communication is initiated that includes the direct speech segment.
Systems and methods for improving the accuracy of a transcription using auxiliary data such as personal data
Tue, 14 Apr 2015 08:00:00 EDT
A method is described for improving the accuracy of a transcription generated by an automatic speech recognition (ASR) engine. A personal vocabulary is maintained that includes replacement words. The replacement words in the personal vocabulary are obtained from personal data associated with a user. A transcription is received of an audio recording. The transcription is generated by an ASR engine using an ASR vocabulary and includes a transcribed word that represents a spoken word in the audio recording. Data is received that is associated with the transcribed word. A replacement word from the personal vocabulary is identified, which is used to re-score the transcription and replace the transcribed word.
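The replacement step the abstract describes can be sketched as a scan over the transcript against the personal vocabulary. The shared-prefix test below is a toy stand-in for the acoustic/orthographic similarity a real re-scoring pass would use.

```python
def similar(a, b):
    """Toy similarity proxy: same first three characters."""
    return a[:3] == b[:3]

def rescore(transcript_words, personal_vocab):
    """Replace each transcribed word with a matching personal-vocabulary
    word, if one exists; otherwise keep the ASR output."""
    out = []
    for w in transcript_words:
        match = next((p for p in personal_vocab if similar(w, p)), None)
        out.append(match if match else w)
    return out
```

The point of the personal vocabulary is that names like a contact's surname, absent from the generic ASR vocabulary, can still win during re-scoring.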
Training a transcription system
Tue, 14 Apr 2015 08:00:00 EDT
According to certain embodiments, training a transcription system includes accessing recorded voice data of a user from one or more sources. The recorded voice data comprises voice samples. A transcript of the recorded voice data is accessed. The transcript comprises text representing one or more words of each voice sample. The transcript and the recorded voice data are provided to a transcription system to generate a voice profile for the user. The voice profile comprises information used to convert a voice sample to corresponding text.
Noise adaptive training for speech recognition
Tue, 14 Apr 2015 08:00:00 EDT
Technologies are described herein for noise adaptive training to achieve robust automatic speech recognition. Through the use of these technologies, a noise adaptive training (NAT) approach may use both clean and corrupted speech for training. The NAT approach may normalize the environmental distortion as part of the model training. A set of underlying “pseudo-clean” model parameters may be estimated directly. This may be done without point estimation of clean speech features as an intermediate step. The pseudo-clean model parameters learned from the NAT technique may be used with a Vector Taylor Series (VTS) adaptation. Such adaptation may support decoding noisy utterances during the operating phase of an automatic voice recognition system.
Method and system for analyzing digital sound audio signal associated with baby cry
Tue, 14 Apr 2015 08:00:00 EDT
A method for analyzing a digital audio signal associated with a baby cry, comprising the steps of: (a) processing the digital audio signal using a spectral analysis to generate a spectral data; (b) processing the digital audio signal using a time-frequency analysis to generate a time-frequency characteristic; (c) categorizing the baby cry into one of a basic type and a special type based on the spectral data; (d) if the baby cry is of the basic type, determining a basic need based on the time-frequency characteristic and a predetermined lookup table; and (e) if the baby cry is of the special type, determining a special need by inputting the time-frequency characteristic into a pre-trained artificial neural network.
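Steps (c)–(e) of the abstract form a simple decision flow: spectral data picks basic vs. special type, a lookup table handles basic cries, and a pre-trained network handles special ones. The 500 Hz boundary, the lookup entries, and the network stub below are all invented placeholders.

```python
# (d) Predetermined lookup table from time-frequency characteristic
# to a basic need (entries are invented).
BASIC_LOOKUP = {"rising": "hunger", "flat": "sleepiness"}

def analyze_cry(spectral_peak_hz, tf_characteristic, special_net):
    """Classify a baby cry and return the inferred need.

    `special_net` stands in for the pre-trained artificial neural
    network; any callable taking the time-frequency characteristic works.
    """
    # (c) Categorize: assume low spectral peaks mark the basic type.
    if spectral_peak_hz < 500:
        # (d) Basic type: consult the predetermined lookup table.
        return BASIC_LOOKUP.get(tf_characteristic, "unknown")
    # (e) Special type: defer to the pre-trained network.
    return special_net(tf_characteristic)
```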
Encoding device, decoding device, and methods therefor
Tue, 14 Apr 2015 08:00:00 EDT
Disclosed is an encoding device that improves the quality of a decoded signal in a hierarchical coding (scalable coding) method, wherein a band to be quantized is selected for every level (layer). The encoding device (101) is equipped with a second layer encoding unit (205) that selects a first band to be quantized of a first input signal from among a plurality of sub-bands, and that generates second layer encoding information containing first band information of said band; a second layer decoding unit (206) that generates a first decoded signal using the second layer encoding information; an addition unit (207) that generates a second input signal using the first input signal and the first decoded signal; and a third layer encoding unit (208) that selects a second band to be quantized of the second input signal using the first decoded signal, and that generates third layer encoding information.
Methods and systems for bit allocation and partitioning in gain-shape vector quantization for audio coding
Tue, 14 Apr 2015 08:00:00 EDT
Embodiments are generally directed to systems and methods for bit allocation and band partitioning for gain-shape vector quantization in an audio codec. An audio codec implements a method that uses an implicit, dynamic scheme to allow an encoder and decoder to recreate a series of bit allocation decisions for gain and shape without transmitting additional side information for each decision, based on the number of bits that are left remaining and available in a given packet. For implementation in practical codecs, the band comprising the allocation of bits for the shape is recursively split into equal partitions until the number of bits allocated to each partition is less than the maximum codebook size.
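The recursive split at the end of the abstract — halve the band's bit allocation until every partition is below the maximum codebook size — is easy to sketch. The sizes in the usage note are arbitrary; this shows only the partitioning rule, not the codec.

```python
def partition_bits(bits, max_codebook_bits):
    """Recursively split a bit allocation into equal halves until each
    partition gets fewer bits than the maximum codebook size."""
    if bits < max_codebook_bits:
        return [bits]
    half = bits // 2
    return (partition_bits(half, max_codebook_bits)
            + partition_bits(bits - half, max_codebook_bits))
```

Because both encoder and decoder can rerun this rule from the remaining bit count alone, no side information about the partitioning needs to be transmitted, which is the point the abstract makes about the implicit scheme.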
Method for processing multichannel acoustic signal, system therefor, and program
Tue, 14 Apr 2015 08:00:00 EDT
A method for processing multichannel acoustic signals which processes input signals of a plurality of channels including the voices of a plurality of speaking persons. The method is characterized by detecting the voice section of each speaking person or each channel, detecting overlapped sections wherein the detected voice sections are common between channels, determining a channel to be subjected to crosstalk removal and the section thereof by use of at least voice sections not including the detected overlapped sections, and removing crosstalk in the sections of the channel to be subjected to the crosstalk removal.
Voice activity detection/silence suppression system
Tue, 14 Apr 2015 08:00:00 EDT
A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel.
Real-time voice recognition on a handheld device
Tue, 14 Apr 2015 08:00:00 EDT
A method and apparatus for implementation of real-time speech recognition using a handheld computing apparatus are provided. The handheld computing apparatus receives an audio signal, such as a user's voice. The handheld computing apparatus ultimately transmits the voice data to a remote or distal computing device with greater processing power and operating a speech recognition software application. The speech recognition software application processes the signal and outputs a set of instructions for implementation either by the computing device or the handheld apparatus. The instructions can include a variety of items including instructing the presentation of a textual representation of dictation, or a function or command to be executed by the handheld device (such as linking to a website, opening a file, cutting, pasting, saving, or other file menu type functionalities), or by the computing device itself.
Method and system for performing sample rate conversion
Tue, 14 Apr 2015 08:00:00 EDT
A method and system for performing sample rate conversion is provided. The method may include configuring a system to convert a sample rate of a first audio channel of a plurality of audio channels to produce a first audio stream of samples. The system may be dynamically reconfigured to convert a sample rate of a second of the plurality of audio channels to produce a second audio stream of samples, wherein the first and second audio streams are output from the system at the same time. The method may further include arbitrating between requests for additional data from the first and second audio streams of samples, where processing of the first channel is suspended when the request corresponds to a second channel of higher priority.
Analyzing a category of a candidate phrase to update from a server if a phrase category is not in a phrase database
Tue, 14 Apr 2015 08:00:00 EDT
The embodiments of the present invention provide an output method for a candidate phrase and an electronic apparatus. The method includes: analyzing, according to phrase categories in a phrase database, the category of a phrase selected from a candidate input list that appears after a user inputs a syllable, so as to judge whether the category of the phrase is contained in the phrase database; increasing the candidate priority of the category containing the phrase in the candidate input list, if the category of the phrase is contained in the phrase database; and transmitting the phrase to a text input server, if the category of the phrase is not contained in the phrase database, so as to update the phrase categories in the phrase database according to an instruction from the text input server.
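The branch in this abstract — boost the category's priority locally if it is known, otherwise send the phrase to the server for an update — can be sketched as follows. The data structures and the `send_to_server` callable are invented stand-ins for the apparatus's components.

```python
def handle_selection(phrase, category, phrase_db, priorities, send_to_server):
    """Process a phrase the user selected from the candidate input list.

    `phrase_db` is the set of locally known phrase categories;
    `priorities` maps category -> candidate priority;
    `send_to_server` stands in for transmission to the text input server.
    """
    if category in phrase_db:
        # Known category: raise its candidate priority locally.
        priorities[category] = priorities.get(category, 0) + 1
    else:
        # Unknown category: defer to the server, which will instruct
        # the apparatus to update its phrase categories.
        send_to_server(phrase)
```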
