State-of-the-Art Language Modelling Results are Possible with Simple Architectures
Statistical language models assign probabilities to sequences of words. They are finding increasing use as natural language processing applications become more widespread. A wide range of applications use language models, including speech recognition, machine translation, part-of-speech tagging, chatbots, handwriting recognition, and information retrieval.
Today, language models enable users to ask Siri where the nearest restaurant is, or walk into a dark kitchen and ask Alexa to switch on smart lights. Google recently demoed an AI agent that called businesses to book appointments. Language modelling is turning what was once science fiction into reality.
The models operate by predicting each subsequent token from the tokens that precede it. Natural language modelling is inherently more complex than modelling formal or programming languages, where word usage can be precisely defined. Natural languages are not designed and have no formal specification. They have large numbers of terms that can be used in multiple ways, and the resulting ambiguities are a challenge for machine learning.
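To make next-token prediction concrete, here is a minimal sketch of a bigram model, which estimates the probability of a token from the single token before it. This is an illustration only, not the kind of neural model discussed later in this post; the function names and the toy corpus are assumptions made for the example.

```python
from collections import Counter, defaultdict


def train_bigram_model(tokens):
    """Count how often each token follows each preceding token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probabilities(counts, prev):
    """Turn the raw follower counts for `prev` into a probability distribution."""
    followers = counts.get(prev, {})
    total = sum(followers.values())
    if total == 0:
        return {}  # `prev` was never seen with a follower
    return {tok: n / total for tok, n in followers.items()}


# Toy corpus (hypothetical); a real model would be trained on far more text.
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram_model(corpus)
probs = next_token_probabilities(model, "the")
print(probs)
```

In this corpus "the" is followed by "cat" twice and "mat" once, so the model assigns "cat" two thirds of the probability mass. Neural language models replace the count table with learned parameters and condition on much longer contexts.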
Language models can be classified as either character-level or word-level. Character-level models have the advantage of a much smaller vocabulary and therefore need less memory (e.g., 26 letters in the English alphabet) compared to word-level models (171,476 words in current use according to the Oxford English Dictionary). However, because they must process far longer sequences, character-level models are more constrained by the vanishing gradient problem encountered by neural networks.
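The trade-off between the two tokenisation levels can be sketched on a toy sentence. The sentence and variable names below are assumptions for illustration. On such a short text the word vocabulary is still tiny; the point is that the character vocabulary is capped by the alphabet while the word vocabulary keeps growing with the corpus, and that the character sequence is much longer.

```python
text = "the cat sat on the mat"

# Word-level: split on whitespace; each distinct word is a vocabulary entry.
word_tokens = text.split()
word_vocab = sorted(set(word_tokens))

# Character-level: every character (including the space) is a token.
char_tokens = list(text)
char_vocab = sorted(set(char_tokens))

print("word-level:", len(word_vocab), "vocabulary entries,", len(word_tokens), "tokens")
print("char-level:", len(char_vocab), "vocabulary entries,", len(char_tokens), "tokens")
```

The same sentence is several times longer as a character sequence than as a word sequence, which is why a character-level model has to carry information across many more time steps.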
The vanishing gradient problem is a machine learning issue that arises when training artificial neural networks with gradient-based learning methods and backpropagation. At each iteration, each of the network's weights receives an update proportional to the partial derivative of the error function with respect to that weight. In some cases, the gradient becomes so small that the weight effectively cannot change its value.
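The effect can be illustrated numerically with a deliberately simplified model: a scalar recurrence h_t = tanh(w * h_{t-1}). This is a sketch under that assumption, not a full network; the function name, the weight of 0.9, and the starting value are all chosen for the example. By the chain rule, the gradient of the final state with respect to the initial state is a product of one factor per time step, and when each factor is smaller than 1 the product shrinks towards zero.

```python
import math


def gradient_through_time(weight, steps, h0=0.5):
    """Return d h_T / d h_0 for the scalar recurrence h_t = tanh(weight * h_{t-1})."""
    h = h0
    grad = 1.0
    for _ in range(steps):
        h = math.tanh(weight * h)
        # Chain rule: d h_t / d h_{t-1} = weight * (1 - tanh(weight * h_{t-1})^2)
        grad *= weight * (1.0 - h * h)
    return grad


for steps in (1, 10, 50, 100):
    print(steps, gradient_through_time(weight=0.9, steps=steps))
```

The printed gradient falls with every additional step, so an error signal from the end of a long sequence barely influences how the earliest inputs were processed.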
Recent research has shown that LSTMs or QRNNs can be tuned to achieve state-of-the-art results on both character-level and word-level datasets using modern GPUs. A recurrent neural network (RNN) exhibits temporal dynamic behavior because the connections between its nodes form a directed graph along a temporal sequence, so each step must wait for the previous one. A quasi-recurrent neural network (QRNN), by contrast, allows many of its calculations to run in parallel across the sequence.
This parallelism improves performance, making QRNNs well suited to applications such as language modelling. LSTMs display a further improvement over QRNNs in language modelling. An LSTM (long short-term memory network) stores information in a memory cell, which allows the RNN to remember inputs over time, similar to the memory of a computer. An LSTM is therefore able to read, write, and delete the information it stores. In conclusion, LSTMs and QRNNs can deliver state-of-the-art results with word-level or character-level models without relying on complex architectures.
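The read, write, and delete operations described above correspond to the output, input, and forget gates of an LSTM cell. The following is a minimal sketch of one time step using the standard LSTM equations, reduced to scalars for readability. The parameter dictionary, its key names, and the weight values are hypothetical; real implementations use weight matrices and learn them from data.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, p):
    """One step of a scalar LSTM cell; `p` holds hypothetical weights and biases."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate: how much memory to delete
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate: how much to write
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate: how much to read
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value to write
    c = f * c_prev + i * g   # updated memory cell
    h = o * math.tanh(c)     # hidden state exposed to the rest of the network
    return h, c


# Hypothetical parameters: every weight 0.5, every bias 0.0.
params = {name: 0.5 for name in ("wf", "uf", "wi", "ui", "wo", "uo", "wg", "ug")}
params.update({"bf": 0.0, "bi": 0.0, "bo": 0.0, "bg": 0.0})

h, c = 0.0, 0.0
for x in [1.0, -0.5, 0.25]:
    h, c = lstm_step(x, h, c, params)
    print(x, h, c)
```

Because the memory cell is updated additively (the old value scaled by the forget gate plus the gated candidate), a forget gate close to 1 and an input gate close to 0 carry the stored value forward almost unchanged, which is what lets an LSTM retain information across many time steps.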
Contact Fusion Professionals and let’s discuss how we can help you explore and deploy these research-based language modelling applications, as well as a range of other best-in-class data analytics technologies that can further enhance your company’s business intelligence.