State-of-the-Art Language Modelling Results are Possible with Simple Architectures
Statistical language models assign probabilities to sequences of words. These models are finding increasing use as natural language processing applications become more widespread. A wide range of applications use language models, including speech recognition, machine translation, part-of-speech tagging, chatbots, handwriting recognition, and information retrieval.
Today, language models enable users to ask Siri where the nearest restaurant is, or walk into a dark kitchen and ask Alexa to switch on smart lights. Google recently demoed an AI agent that called businesses to book appointments. Language modelling is turning what was once science fiction into reality.
The models operate by predicting subsequent tokens from the preceding tokens. Natural language modelling is inherently more complex than modelling formal or programming languages, where word usage can be precisely defined. Natural languages are not designed and have no formal specification. They have large vocabularies and many ways to use each term, and the resulting ambiguities are a challenge for machine learning.
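As a rough illustration of predicting the next token from the preceding one, here is a minimal count-based bigram sketch. It is far simpler than the neural models discussed in this article, and the toy corpus and function names are invented for the example.

```python
from collections import Counter, defaultdict


def train_bigram_model(tokens):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probabilities(counts, prev):
    """Turn the raw counts for `prev` into a probability distribution."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # `prev` was never seen with a following token
    return {tok: n / total for tok, n in followers.items()}


# Toy corpus, invented for illustration only.
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram_model(corpus)
probs = next_token_probabilities(model, "the")
print(probs)
```

In this corpus "the" is followed by "cat" twice and "mat" once, so the model assigns "cat" twice the probability of "mat". A neural language model replaces the count table with learned parameters and conditions on a much longer history.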
Language models can be classified as either character-level or word-level. Character-level models have the advantage of requiring less memory because their vocabulary is far smaller (e.g., 26 letters in the English alphabet, compared with the 171,476 words in use according to the Oxford Dictionary). However, the same text becomes a much longer sequence when split into characters rather than words, so character-level models are more constrained by the vanishing gradient problem encountered by neural networks.
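The trade-off can be made concrete with some back-of-the-envelope arithmetic. The sketch below uses the two vocabulary sizes quoted above; the embedding dimension of 400 and the 4-byte values are assumptions chosen only for illustration, as is the example sentence.

```python
def embedding_table_bytes(vocab_size, embedding_dim, bytes_per_value=4):
    """Memory needed to store one embedding vector per vocabulary entry."""
    return vocab_size * embedding_dim * bytes_per_value


EMBEDDING_DIM = 400  # hypothetical size, not taken from the article

char_bytes = embedding_table_bytes(26, EMBEDDING_DIM)
word_bytes = embedding_table_bytes(171_476, EMBEDDING_DIM)

# The price of the small vocabulary: the same text is a longer sequence.
sentence = "the cat sat on the mat"  # toy example
word_steps = len(sentence.split())   # one step per word
char_steps = len(sentence)           # one step per character, spaces included

print(f"character table: {char_bytes:,} bytes, {char_steps} steps")
print(f"word table:      {word_bytes:,} bytes, {word_steps} steps")
```

Under these assumptions the word-level table is several thousand times larger, while the character-level model has to carry information across several times as many steps for the same sentence.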
The vanishing gradient problem is a machine learning issue that arises when training artificial neural networks with gradient-based learning methods and backpropagation. At each iteration, each of the network's weights receives an update proportional to the partial derivative of the error function with respect to that weight. In some cases, the gradient becomes so small that the weight effectively cannot change its value.
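To see how the gradient can shrink, consider a deliberately tiny recurrent network with a single scalar state, h_t = tanh(w * h_{t-1} + x_t). This is a sketch under assumed values (w = 0.5, constant input 0.1), not a model from the research mentioned in this article. By the chain rule, the sensitivity of the final state to the initial state is a product of one factor per step, and each factor here is smaller than 1.

```python
import math


def gradient_through_time(w, inputs, h0=0.0):
    """Return |d h_t / d h_0| after each step of h_t = tanh(w * h_{t-1} + x_t)."""
    h = h0
    grad = 1.0
    history = []
    for x in inputs:
        h = math.tanh(w * h + x)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - tanh^2(...)) = w * (1 - h_t^2)
        grad *= w * (1.0 - h * h)
        history.append(abs(grad))
    return history


# Assumed weight and inputs, chosen only to make the effect visible.
magnitudes = gradient_through_time(w=0.5, inputs=[0.1] * 50)
for step in (1, 10, 25, 50):
    print(f"step {step:2d}: {magnitudes[step - 1]:.3e}")
```

Because every factor is at most 0.5 in this setup, the gradient after 50 steps is at most 0.5 to the power of 50, which is far too small to drive a meaningful weight update.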
Recent research has shown that LSTMs or QRNNs can be tuned to achieve state-of-the-art results on both character-level and word-level datasets using modern GPUs. A recurrent neural network (RNN) exhibits temporal dynamic behavior because the connections between its nodes form a directed graph along a temporal sequence, so each step has to wait for the previous one. A quasi-recurrent neural network (QRNN), by contrast, allows many of its calculations to run in parallel across timesteps.
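The following is a minimal single-channel sketch of the QRNN idea, assuming a filter width of 2 and the "f-pooling" variant from the QRNN literature. The weights and inputs are arbitrary illustrative values, and a real layer would use vectors and matrix operations on a GPU rather than Python lists.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def qrnn_layer(xs, wz, wf):
    """Single-channel QRNN layer with filter width 2 and f-pooling.

    wz and wf are (weight_on_previous_input, weight_on_current_input) pairs.
    """
    n = len(xs)
    padded = [0.0] + list(xs)  # left-pad so step t sees only x_{t-1} and x_t

    # Stage 1: candidates and forget gates depend only on the inputs,
    # so every timestep could be computed at the same time.
    z = [math.tanh(wz[0] * padded[t] + wz[1] * padded[t + 1]) for t in range(n)]
    f = [sigmoid(wf[0] * padded[t] + wf[1] * padded[t + 1]) for t in range(n)]

    # Stage 2: the only sequential part is a cheap elementwise update.
    h, hs = 0.0, []
    for z_t, f_t in zip(z, f):
        h = f_t * h + (1.0 - f_t) * z_t
        hs.append(h)
    return hs


# Arbitrary example inputs and weights.
hs = qrnn_layer([1.0, -1.0, 0.5], wz=(0.3, 0.8), wf=(0.1, -0.2))
print(hs)
```

The point of the design is that the expensive work in stage 1 has no dependency on earlier hidden states, unlike a conventional RNN, where the gate computations themselves depend on the previous step's output.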
This parallelism results in improved performance, making QRNNs well suited to applications such as language modelling. LSTMs display a further improvement over QRNNs in language modelling. An LSTM stores information in a memory cell, which allows it to remember inputs over time, similar to the memory functions of a computer. An LSTM is therefore able to read, write, and delete the information it stores. In conclusion, LSTMs and QRNNs can deliver state-of-the-art results with word-level or character-level models without relying on complex architectures.
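The read, write, and delete behaviour of an LSTM described above corresponds to its output, input, and forget gates. Below is a minimal sketch of one step of a scalar LSTM cell using the standard gate equations; the parameter layout and the example values are assumptions made for illustration.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a scalar LSTM cell.

    params maps each gate name to an (input_weight, hidden_weight, bias) tuple.
    """
    def pre_activation(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre_activation("forget"))        # delete: how much old memory to keep
    i = sigmoid(pre_activation("input"))         # write: how much new content to store
    o = sigmoid(pre_activation("output"))        # read: how much memory to expose
    g = math.tanh(pre_activation("candidate"))   # the new content itself

    c = f * c_prev + i * g   # updated memory cell
    h = o * math.tanh(c)     # visible output
    return h, c


# Illustrative parameters: keep the memory, write nothing, read everything.
keep = {
    "forget": (0.0, 0.0, 20.0),
    "input": (0.0, 0.0, -20.0),
    "output": (0.0, 0.0, 20.0),
    "candidate": (1.0, 0.0, 0.0),
}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=0.7, params=keep)
print(h, c)
```

With the forget gate close to 1 and the input gate close to 0, the cell carries its stored value forward almost unchanged, which is how an LSTM can preserve information across many steps.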
Contact Fusion Professionals and let’s discuss how we can help you explore and deploy these research-based language modelling applications, as well as a range of other best-in-class data analytics technologies that can further enhance your company’s business intelligence.