ARTIFICIAL INTELLIGENCE AND NATURAL LANGUAGE PROCESSING
ARTIFICIAL INTELLIGENCE OVERVIEW
AI stands for ‘Artificial Intelligence’, which means making machines capable of
performing intelligent tasks the way human beings do. AI performs automated
tasks using intelligence. The term Artificial Intelligence has two key components:
- Automation
- Intelligence
Stages of Artificial Intelligence
Stage 1 – Machine Learning – A set of algorithms used by intelligent systems to
learn from experience.
Stage 2 – Machine Intelligence – A more advanced set of algorithms used by
machines to learn from experience, e.g. Deep Neural Networks. Artificial
Intelligence technology is currently at this stage.
Stage 3 – Machine Consciousness – Self-learning from experience without the
need for external data.
TYPES OF ARTIFICIAL INTELLIGENCE
ANI – Artificial Narrow Intelligence – It comprises basic/role tasks such as
those performed by chatbots and personal assistants like Siri by Apple and
Alexa by Amazon.
AGI – Artificial General Intelligence – It comprises human-level tasks and
involves continual learning by the machine. Self-driving cars by Uber and
Autopilot by Tesla are often cited in this context, although such systems are
still generally regarded as narrow AI.
ASI – Artificial Super Intelligence – It refers to intelligence that far
exceeds that of humans.
WHAT MAKES A SYSTEM AI-ENABLED
DIFFERENCE BETWEEN NLP, AI, ML, DL & NN
AI or Artificial Intelligence – Building systems that can do intelligent things.
NLP or Natural Language Processing – Building systems that can understand
language. It is a subset of Artificial Intelligence.
ML or Machine Learning – Building systems that can learn from experience.
It is also a subset of Artificial Intelligence.
NN or Neural Network – A biologically inspired network of artificial neurons.
DL or Deep Learning – Building systems that use deep neural networks on large
sets of data. It is a subset of Machine Learning.
WHAT IS NATURAL LANGUAGE PROCESSING?
Natural Language Processing (NLP) is the “ability of machines to understand and
interpret human language the way it is written or spoken”. The objective of NLP
is to make computers as capable as human beings at understanding language.
The ultimate goal of NLP is to fill the gap between how humans communicate
(natural language) and what the computer understands (machine language).
There are three levels of linguistic analysis performed in NLP:
Syntax – Which parts of the given text are grammatically correct?
Semantics – What is the meaning of the given text?
Pragmatics – What is the purpose of the text?
NLP also deals with other aspects of language, such as:
Phonology – The systematic organization of sounds in a language.
Morphology – The study of word formation and the relationship of words with
each other.
Approaches of NLP to semantic analysis:
Distributional – It employs the large-scale statistical tactics of Machine
Learning and Deep Learning.
Frame-Based – Sentences that are syntactically different but semantically the
same are represented inside a data structure (frame) for the stereotyped
situation.
Theoretical – This approach is based on the idea that sentences refer to the
real world (the sky is blue) and that parts of a sentence can be combined to
represent the whole meaning.
Interactive Learning – It takes a pragmatic approach in which the user is
responsible for teaching the computer the language step by step in an
interactive learning environment.
The true success of NLP lies in humans being deceived into believing that they
are talking to other humans instead of computers.
Why Do We Need NLP?
With NLP, tasks such as automated speech and automated text writing can be
performed in less time. Given the large amount of text data around us, why not
use the computer's tireless ability to run several algorithms and complete such
tasks in very little time?
These tasks include other NLP applications such as Automatic Summarization
(generating a summary of a given text) and Machine Translation (translating
text from one language into another).
PROCESS OF NLP
If the input is speech, speech-to-text conversion is performed first. The
mechanism of Natural Language Processing involves two processes: Natural
Language Understanding and Natural Language Generation.
NATURAL LANGUAGE UNDERSTANDING
NLU or Natural Language Understanding tries to understand the meaning of a given
text. The nature and structure of each word in the text must be understood for
NLU. To understand structure, NLU tries to resolve the following kinds of
ambiguity present in natural language:
- Lexical Ambiguity – A word has multiple meanings.
- Syntactic Ambiguity – A sentence has multiple parse trees.
- Semantic Ambiguity – A sentence has multiple meanings.
- Anaphoric Ambiguity – A phrase or word refers back to something mentioned
earlier, and it is unclear what it refers to.
Next, the meaning of each word is understood using lexicons (vocabulary) and a
set of grammatical rules. This is complicated by different words that have
similar meanings (synonyms) and words that have more than one meaning (polysemy).
NATURAL LANGUAGE GENERATION
It is the process of automatically producing readable text, with meaningful
phrases and sentences, from structured data. Natural language generation is a
hard problem and is a subset of NLP. It is commonly divided into three stages:
- Text Planning – The basic content in the structured data is selected and ordered.
- Sentence Planning – Content from the structured data is combined into
sentences to represent the flow of information.
- Realization – Grammatically correct sentences are finally produced to
represent the text.
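The three stages can be illustrated with a small template-based sketch. Everything in it is an assumption made for illustration: the weather record, the field names and the sentence templates are hypothetical, and real NLG systems are considerably more elaborate.

```python
# Hypothetical structured data: a weather record for one city.
record = {"city": "Pune", "high_c": 31, "low_c": 22, "condition": "cloudy"}


def plan_text(data):
    """Text planning: choose which facts to mention and in what order."""
    order = ["condition", "high_c", "low_c"]
    return [(key, data[key]) for key in order if key in data]


def plan_sentences(facts):
    """Sentence planning: group the ordered facts into sentence-sized units."""
    condition = [f for f in facts if f[0] == "condition"]
    temperatures = [f for f in facts if f[0] in ("high_c", "low_c")]
    return [group for group in (condition, temperatures) if group]


def realize(city, sentence_plans):
    """Realization: turn each group of facts into a grammatical sentence."""
    sentences = []
    for group in sentence_plans:
        values = dict(group)
        if "condition" in values:
            sentences.append(f"It is {values['condition']} in {city} today.")
        else:
            parts = []
            if "high_c" in values:
                parts.append(f"a high of {values['high_c']} degrees Celsius")
            if "low_c" in values:
                parts.append(f"a low of {values['low_c']} degrees Celsius")
            sentences.append("Expect " + " and ".join(parts) + ".")
    return " ".join(sentences)


def generate(data):
    """Run the three stages in order on one structured record."""
    return realize(data["city"], plan_sentences(plan_text(data)))


print(generate(record))
```

Keeping the stages as separate functions means the ordering of facts can change without touching the wording, and the wording can change without touching the ordering.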
DIFFERENCE BETWEEN NLP AND TEXT MINING OR TEXT ANALYTICS
Natural language processing is responsible for understanding the meaning and
structure of a given text. Text Mining or Text Analytics is the process of
extracting hidden information from text data through pattern recognition.
Natural language processing is used to understand the meaning (semantics) of
given text data, while text mining is used to understand the structure (syntax)
of given text data. As an example, take the sentence "I found my wallet near
the bank." The task of NLP is to determine whether ‘bank’ refers to a financial
institution or a river bank.
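One simple way to approach the ‘bank’ example is to compare the words around the ambiguous word with a short description of each sense and pick the sense with the largest overlap. The sketch below is a simplified variant of that idea (often associated with the Lesk algorithm), not the method of any particular library; the sense descriptions and the stopword list are hand-written assumptions.

```python
import re

# Hypothetical, hand-written descriptions of two senses of "bank".
SENSES = {
    "financial institution": "an institution where people deposit money "
                             "take loans and use an atm or account",
    "river bank": "the land along the edge of a river or stream "
                  "where water flows",
}

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "i", "my", "of", "or", "and", "to",
             "near", "where", "use"}


def content_words(text):
    """Lowercase the text and keep alphabetic words that are not stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def disambiguate(sentence, senses=SENSES):
    """Return the best sense and the overlap score of every sense."""
    context = content_words(sentence)
    scores = {sense: len(context & content_words(gloss))
              for sense, gloss in senses.items()}
    best = max(scores, key=scores.get)
    return best, scores


print(disambiguate("I withdrew money from the bank account"))
print(disambiguate("We sat on the bank of the river and watched the water"))
print(disambiguate("I found my wallet near the bank."))
```

For the wallet sentence both senses score zero, so the returned sense is only an arbitrary tie-break: the sentence on its own does not carry enough context to decide, which is why the surrounding text matters.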
WHAT IS BIG DATA?
According to Dr. Kirk Borne, Principal Data Scientist, big data is everything,
quantified, and tracked. For more details on Big Data, please read – Ingestion
and Processing of Data For Big Data and IoT Solutions
NLP for Big Data is the Next Big Thing
Today around 80 % of total data is available in raw form. Big Data comes from
information stored in big organizations as well as enterprises. Examples include
employee information, company purchase and sale records, business transactions,
historical records of organizations, social media, etc. Human language is
ambiguous and unstructured, which makes it hard for computers to interpret, but
with the help of NLP this huge volume of unstructured data can be mined for
patterns to better understand the information it contains. NLP can solve big
problems of the business world by using Big Data, be it in retail, healthcare,
or financial institutions.
WHAT IS A CHATBOT?
Chatbots or Automated Intelligent Agents
- These are computer programs you can talk to through messaging apps, chat
windows, or voice calling apps.
- These are intelligent digital assistants used to resolve customer queries in
a cost-effective, quick, and consistent manner.
Importance of Chatbots
Chatbots are important for understanding the changes in digital customer care
services and for handling the routine queries that are asked most frequently.
Chatbots are especially useful when customer service requests are specific to
an area and highly predictable, so that a high volume of similar requests can
be managed with automated responses.
WORKING OF A CHATBOT
Knowledge Base – It contains the information that the chatbot needs in order to
respond to customer queries.
Data Store – It contains the history of the chatbot's interactions with users.
NLP Layer – It translates users' free-form queries into information that can be
used to select appropriate responses.
Application Layer – It is the application interface that is used to interact
with the user.
Chatbots learn from each interaction with the user, trying to match user
queries with the information in the knowledge base using machine learning.
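The four components can be sketched in a few lines. This is only an illustration under loud assumptions: the knowledge-base entries are invented, and simple keyword overlap stands in for the machine-learning matching described above.

```python
import re

# Knowledge base: hypothetical keyword sets mapped to canned answers.
KNOWLEDGE_BASE = {
    frozenset({"reset", "password"}): "Use the 'Forgot password' link on the login page.",
    frozenset({"refund", "order"}): "Refunds are processed within 5 working days.",
    frozenset({"opening", "hours"}): "Support is available from 9 am to 6 pm.",
}

FALLBACK = "Sorry, I did not understand. Could you rephrase?"

# Data store: interaction history kept as (query, response) pairs.
data_store = []


def nlp_layer(query):
    """Translate a free-form query into a set of lowercase word tokens."""
    return set(re.findall(r"[a-z]+", query.lower()))


def find_response(tokens):
    """Pick the knowledge-base entry whose keywords overlap the query most."""
    best_answer, best_score = None, 0
    for keywords, answer in KNOWLEDGE_BASE.items():
        score = len(keywords & tokens)
        if score > best_score:
            best_answer, best_score = answer, score
    return best_answer or FALLBACK


def chatbot(query):
    """Application layer: accept a query, reply, and record the exchange."""
    response = find_response(nlp_layer(query))
    data_store.append((query, response))
    return response


print(chatbot("How do I reset my password?"))
print(chatbot("What is the weather?"))
```

The history collected in `data_store` is what a learning chatbot would later use as training data, for example to discover frequently asked questions that the knowledge base does not yet cover.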
WHY DEEP LEARNING IS NEEDED IN NLP
Traditional NLP uses a rule-based approach that represents words as ‘One-Hot’
encoded vectors. This traditional method focuses on syntactic representation
instead of semantic representation. For example, a bag-of-words classification
model is unable to distinguish certain contexts.
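Both limitations are easy to see in a small sketch. The two example sentences and the helper names are chosen for illustration only. One-hot vectors of any two different words have a dot product of zero, so they carry no notion of similarity, and a bag-of-words vector ignores word order.

```python
from collections import Counter


def build_vocab(sentences):
    """Map every distinct word to an index, in alphabetical order."""
    words = sorted({w for s in sentences for w in s.lower().split()})
    return {w: i for i, w in enumerate(words)}


def one_hot(word, vocab):
    """Vector with a single 1 at the word's index and 0 everywhere else."""
    vec = [0] * len(vocab)
    vec[vocab[word]] = 1
    return vec


def bag_of_words(sentence, vocab):
    """Count how often each vocabulary word occurs, ignoring word order."""
    counts = Counter(sentence.lower().split())
    return [counts[w] for w in vocab]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


sentences = ["the dog bit the man", "the man bit the dog"]
vocab = build_vocab(sentences)

# Different words are always orthogonal under one-hot encoding.
print(dot(one_hot("dog", vocab), one_hot("man", vocab)))

# Opposite meanings, identical bag-of-words vectors.
print(bag_of_words(sentences[0], vocab) == bag_of_words(sentences[1], vocab))
```

Learned dense representations (word embeddings) and order-aware models are the usual deep-learning answers to these two problems.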
Three Capabilities of Deep Learning
Expressibility – How well a machine can approximate universal functions.
Trainability – How well and how quickly a DL system can learn its problem.
Generalizability – How well the machine can make predictions on data that it
has not been trained on.
There are of course other capabilities that also need to be considered in Deep
Learning, such as interpretability, modularity, transferability, latency,
adversarial stability, and security, but these three are the main ones.
COMMON TASKS OF DEEP LEARNING IN NLP
| Deep Learning Algorithms | NLP Usage |
|---|---|
| Neural Network – NN (feed) | |
| Recurrent Neural Networks (RNN) | |
| Recursive Neural Networks | |
| Convolutional Neural Network (CNN) | |
Difference Between Classical NLP & Deep Learning NLP
NLP For Log Analysis and Log Mining
What is Log?
A log is a collection of messages from different network devices and hardware,
in time sequence. Logs may be written to files on hard disks or sent over the
network as a stream of messages to a log collector. Logs make it possible to
maintain and track hardware performance, parameter tuning, emergency handling
and recovery of systems, and optimization of applications and infrastructure.
You may also like to read – Understanding Log Analytics, Log Mining and Anomaly
Detection
What is Log Analysis?
Log analysis is the process of extracting information from logs, taking into
account the different syntax and semantics of the messages in the log files and
interpreting their context within the application, in order to compare log
files coming from different sources for anomaly detection and finding
correlations.
What is Log Mining?
Log mining, or log knowledge discovery, is the process of extracting patterns
and correlations in logs to reveal knowledge and to predict anomalies, if any,
in log messages.
TECHNIQUES USED FOR LOG ANALYSIS AND LOG MINING
The different techniques used for performing log analysis are described below:
- Pattern recognition – It involves comparing log messages with messages stored
in a pattern book in order to filter out messages.
- Normalization – Log messages are normalized to convert different messages
into the same format. This is done when log messages with different
terminology but the same interpretation come from different sources such as
applications or operating systems.
- Classification & Tagging – It involves ordering the different log messages
and tagging them with keywords for later analysis.
- Artificial Ignorance – It is a technique that uses machine learning
algorithms to discard uninteresting log messages. It is also used to detect
anomalies in the normal working of systems.
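The four techniques can be combined in a small sketch. All of it is assumed for illustration: the log lines, their `date time source level message` layout, the alias table and the pattern book are hypothetical, and a fixed ignore list stands in for the machine-learning step of artificial ignorance.

```python
import re

# Hypothetical raw log lines from sources that use different wording.
RAW_LOGS = [
    "2024-01-05 10:00:01 app1 ERROR Connection to 10.0.0.5 failed",
    "2024-01-05 10:00:02 kernel INFO heartbeat ok",
    "2024-01-05 10:00:03 app2 ERR connection to 10.0.0.7 FAILED",
    "2024-01-05 10:00:04 app1 WARN disk usage at 91%",
]

# Normalization: map source-specific level names to one shared vocabulary.
LEVEL_ALIASES = {"ERR": "ERROR", "WARN": "WARNING"}

# Pattern book: known message shapes and the tag attached to each.
PATTERN_BOOK = {
    "connection_failure": re.compile(r"connection to <ip> failed"),
    "disk_pressure": re.compile(r"disk usage at <num>%"),
}

# Artificial ignorance (rule-based stand-in): messages known to be routine.
IGNORE = [re.compile(r"heartbeat ok")]


def normalize(line):
    """Split a line into fields and replace variable parts with placeholders."""
    date, time, source, level, message = line.split(" ", 4)
    message = message.lower()
    message = re.sub(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", "<ip>", message)
    message = re.sub(r"\b\d+\b", "<num>", message)
    return {"time": f"{date} {time}", "source": source,
            "level": LEVEL_ALIASES.get(level, level), "message": message}


def classify(record):
    """Pattern recognition and tagging: first matching pattern gives the tag."""
    for tag, pattern in PATTERN_BOOK.items():
        if pattern.search(record["message"]):
            return tag
    return "unknown"


def analyze(lines):
    """Normalize every line, drop routine messages, and tag the rest."""
    results = []
    for line in lines:
        record = normalize(line)
        if any(p.search(record["message"]) for p in IGNORE):
            continue
        record["tag"] = classify(record)
        results.append(record)
    return results


for rec in analyze(RAW_LOGS):
    print(rec["time"], rec["level"], rec["tag"], "|", rec["message"])
```

After normalization the two connection errors become the same message even though they came from different applications with different wording, which is what makes counting and correlating them possible.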
ROLE OF NLP IN LOG ANALYSIS & LOG MINING
Natural Language Processing techniques are widely used in log analysis and log
mining. Techniques such as tokenization, stemming, lemmatization, and parsing
are used to convert log messages into a structured form. Once logs are
available in a well-structured form, log analysis and log mining are performed
to extract useful information, and knowledge is discovered from that
information. An example is an error log produced by a server failure.
DIVING INTO NATURAL LANGUAGE PROCESSING
Natural language processing is a complex field at the intersection of
artificial intelligence, computational linguistics, and computer science.
GETTING STARTED WITH NLP
The user first needs to import a file containing written text. The user should
then perform the following steps for natural language processing.
| Technique | Example | Output |
|---|---|---|
| Sentence Segmentation | Mark met the president. He said: “Hi! What’s up – Alex?” | Sentence 1 – Mark met the president. Sentence 2 – He said: “Hi! What’s up – Alex?” |
| Tokenization | My phone tries to ‘charging’ from ‘discharging’ state. | [My] [phone] [tries] [to] [‘] [charging] [’] [from] [‘] [discharging] [’] [state] [.] |
| Stemming/Lemmatization | Drinking, Drank, Drunk | Drink |
| Part-of-Speech tagging | If you build it he will come. | If/IN (preposition or subordinating conjunction), you/PRP (personal pronoun), build/VBP (verb, non-3rd person singular present), it/PRP (personal pronoun), he/PRP (personal pronoun), will/MD (modal verb), come/VB (verb, base form) |
| Parsing | Mark and Joe went into a bar. | (S (NP (NP Mark) and (NP Joe)) (VP went (PP into (NP a bar)))) |
| Named Entity Recognition | Let’s meet Alice at 6 am in India. | Alice – Person; 6 am – Time; India – Location |
| Coreference resolution | Mark went into the mall. He thought it was a shopping mall. | “He” refers to Mark; “it” refers to the mall |
- Sentence segmentation – It identifies sentence boundaries in the given text,
i.e. where one sentence ends and another begins. Sentence endings are often
marked with the punctuation mark ‘.’
- Tokenization – It identifies the individual words, numbers, and punctuation
symbols.
- Stemming – It strips the endings of words, e.g. ‘eating’ is reduced to ‘eat’.
- Part of speech (POS) tagging – It assigns each word in a sentence its
part-of-speech tag, such as noun or adverb.
- Parsing – It divides the given text into grammatical constituents in order to
answer questions such as which part of a sentence modifies another part.
- Named Entity Recognition – It identifies entities such as persons, locations,
and times within documents.
- Co-Reference resolution – It determines how a given word in a sentence
relates to entities mentioned in the previous or following sentences.
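The first three steps can be approximated with regular expressions alone. The sketch below is deliberately naive and rests on stated assumptions: a sentence ends at ‘.’, ‘!’ or ‘?’ followed by whitespace and a capital letter, and the stemmer only strips the endings ‘ing’, ‘ed’ and ‘s’. POS tagging, parsing, named entity recognition and coreference resolution need trained models or an NLP library and are not attempted here.

```python
import re


def segment_sentences(text):
    """Split after '.', '!' or '?' when followed by whitespace and a capital."""
    return re.split(r"(?<=[.!?])\s+(?=[A-Z])", text.strip())


def tokenize(sentence):
    """Separate words and numbers from punctuation symbols."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def stem(word):
    """Strip a few common English endings, keeping at least three letters."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


text = "Mark met the president. He is eating 2 apples."
for sentence in segment_sentences(text):
    tokens = tokenize(sentence)
    print(tokens, [stem(t.lower()) for t in tokens])
```

The segmentation rule is easy to fool: applied to quoted speech such as “Hi! What’s up”, it would also split after “Hi!”, which is one reason practical systems use trained sentence splitters rather than a single pattern.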
FURTHER KEY APPLICATION AREAS OF NLP
Apart from Big Data, Log Mining, and Log Analysis, NLP has other major
application areas. Although the term ‘NLP’ is not as popular as ‘big data’ or
‘machine learning’, we use NLP every day.
Automatic summarizer – Given an input text, the task is to write a summary of
the text, discarding irrelevant points.
Sentiment analysis – It is performed on a given text to predict its subjective
content, e.g. whether the text conveys a judgment, an opinion, or a review.
Text classification – It is performed to categorize journals and news stories
according to their domain. Multi-document classification is also possible. A
popular example of text classification is spam detection in emails.
Based on the style of writing in a journal, its attributes can also be used to
identify the author.
Information Extraction – It extracts structured facts from text, which is what
allows an email program to automatically add events to the calendar.
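As an illustration of information extraction, the sketch below pulls a weekday and a clock time out of an email body so that a calendar entry could be proposed. The patterns are assumptions chosen for the example: they only recognise English weekday names and times written like "6 pm" or "10:30 am", and no real email or calendar API is involved.

```python
import re

# Times such as "6 pm" or "10:30 am" (assumed format for this sketch).
TIME_RE = re.compile(r"\b(\d{1,2})(?::(\d{2}))?\s*(am|pm)\b", re.IGNORECASE)
DAY_RE = re.compile(
    r"\b(monday|tuesday|wednesday|thursday|friday|saturday|sunday)\b",
    re.IGNORECASE,
)


def extract_event(email_body):
    """Return the first weekday and 24-hour time mentioned, or None."""
    day = DAY_RE.search(email_body)
    time = TIME_RE.search(email_body)
    if not (day and time):
        return None
    hour = int(time.group(1)) % 12          # 12 am -> 0, 12 pm -> 12 below
    if time.group(3).lower() == "pm":
        hour += 12
    minute = int(time.group(2) or 0)
    return {"day": day.group(1).capitalize(), "time": f"{hour:02d}:{minute:02d}"}


print(extract_event("Let's meet on Friday at 6 pm to review the report."))
print(extract_event("Thanks for the update."))
```

Returning `None` when either piece is missing keeps the program from proposing calendar entries based on half an event.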
Thanks …
Avijit Kumar Roy | avijitkumarroy@gmail.com | +91 98369 76920 | www.avijit.zumvu.com









