International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 04 Issue: 12 | Dec-2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 6.171 | ISO 9001:2008 Certified Journal | Page 366
Text Segmentation for Online Subjective Examination Using Machine
Learning
Shahid Khan1, Rakshanda Chavan2 , Diksha Singh3, Tina Sajwan4
1,2,3,4 Modern Education Society's College of Engineering, Pune-411001
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - This paper focuses on text segmentation for
natural language using the k-Nearest Neighbour (K-NN)
classifier, a type of instance-based (lazy) learning in which
the function is approximated only locally and all computation
is deferred until classification. Text segmentation divides
written text into meaningful units; the term covers both the
process humans use when reading text and the artificial
processes implemented in computers, which are the subject of
natural language processing. We define a similarity measure
between feature vectors that takes into account the
similarity among attributes as well as among their values,
modify K-NN to use this measure, and apply the modified
version to the text segmentation task. The goal of this paper
is to implement natural language processing through text
segmentation and to realise the benefits this provides.
Key Words: K-NN, text segmentation, feature similarity,
NLP
1. INTRODUCTION
Text segmentation is defined as the process of automatically
segmenting a long text into parts based on its topic or
content. Information retrieval (IR) systems tend to retrieve
long texts that cover more than one topic as highly relevant
to a given query, so long texts need to be partitioned topic
by topic. The task is to partition the text into sentences or
paragraphs and to judge whether a topic boundary falls
between two adjacent sentences or paragraphs. The text is
given as the input and segmented into paragraphs, a list of
pairs of adjacent paragraphs is generated, and for each pair
we judge whether to put a topic boundary between them. The
task is thus interpreted as a binary classification in which
each sentence or paragraph pair is classified as separation
(a transition to a different topic) or non-separation (a
continuation of the same topic).
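The mapping from segmentation to binary classification can be sketched as follows. This is a minimal illustration, not the system's implementation: the blank-line paragraph convention, the function names, and the toy `no_shared_words` rule standing in for a trained classifier are all assumptions made for the example.

```python
from typing import Callable, List, Tuple


def split_paragraphs(text: str) -> List[str]:
    """Split raw text into paragraphs on blank lines (an assumed convention)."""
    blocks = [block.strip() for block in text.split("\n\n")]
    return [block for block in blocks if block]


def adjacent_pairs(paragraphs: List[str]) -> List[Tuple[str, str]]:
    """Return every pair of neighbouring paragraphs, in reading order."""
    return list(zip(paragraphs, paragraphs[1:]))


def segment(text: str, is_boundary: Callable[[str, str], bool]) -> List[List[str]]:
    """Group paragraphs into topic segments using a binary boundary classifier."""
    paragraphs = split_paragraphs(text)
    if not paragraphs:
        return []
    segments = [[paragraphs[0]]]
    for left, right in adjacent_pairs(paragraphs):
        if is_boundary(left, right):
            segments.append([right])    # separation: start a new topic segment
        else:
            segments[-1].append(right)  # non-separation: continue the topic
    return segments


def no_shared_words(left: str, right: str) -> bool:
    """Toy stand-in classifier: a boundary when the paragraphs share no words."""
    return not (set(left.lower().split()) & set(right.lower().split()))


sample = "cats purr softly\n\ncats sleep often\n\nstocks fell today"
print(segment(sample, no_shared_words))
```

Any classifier that accepts a pair of paragraphs and returns a yes/no decision can be passed in place of the toy rule.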
Two problems arise when texts are encoded into numerical
vectors and their similarities are computed from attribute
values alone. The first is huge dimensionality: texts must be
encoded into numerical vectors of very large dimension. This
makes processing each numerical vector representing a
document very costly in terms of time and system resources,
and the number of training examples required to avoid
overfitting grows in proportion to the dimension. The second
problem is sparse distribution, where zero values dominate
each numerical vector.
We propose the following in this research. We assume that
words are given as the features of numerical vectors when
texts are encoded, and that words have semantic relations
with one another. Based on this assumption, we define a
similarity measure between feature vectors that considers
both the feature values and the features themselves. We
modify KNN into a version that uses both the feature
similarity and the feature value similarity, and apply it to
the classification task mapped from text segmentation. We
expect this version to be more tolerant of sparse
distributions and to potentially avoid the huge
dimensionality.
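One way to picture a similarity measure that considers both features and feature values is a cosine-style score weighted by a feature similarity matrix. The sketch below is an illustration under assumptions, not the measure defined in the cited work: the normalisation, the function names, and the hand-written matrix `S` for three hypothetical word features are all invented for the example.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def bilinear(x: Vector, s: Matrix, y: Vector) -> float:
    """Compute the sum over i, j of x[i] * s[i][j] * y[j]."""
    return sum(x[i] * s[i][j] * y[j] for i in range(len(x)) for j in range(len(y)))


def feature_aware_similarity(x: Vector, y: Vector, s: Matrix) -> float:
    """Cosine-style similarity that also credits matches between similar features."""
    denom = math.sqrt(bilinear(x, s, x) * bilinear(y, s, y))
    return bilinear(x, s, y) / denom if denom > 0 else 0.0


# Hypothetical features, in order: "car", "automobile", "banana".
# S[i][j] is an assumed semantic similarity between feature i and feature j.
S = [
    [1.0, 0.8, 0.0],
    [0.8, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

doc_a = [1.0, 0.0, 0.0]  # mentions only "car"
doc_b = [0.0, 1.0, 0.0]  # mentions only "automobile"

plain = feature_aware_similarity(doc_a, doc_b, IDENTITY)  # feature values only
aware = feature_aware_similarity(doc_a, doc_b, S)         # features and values
print(plain, aware)
```

With the identity matrix the score reduces to the ordinary cosine similarity, so two sparse vectors with no shared non-zero positions score zero; with `S` they still receive credit for related features, which is the intuition behind the expected tolerance to sparse distributions.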
Implementing these ideas is expected to bring the following
benefits. We may potentially cut down the dimensionality
needed to encode texts into numerical vectors. The
information loss in computing the similarity between texts
may also be reduced by reflecting the similarities among the
features.
By representing texts in an alternative form to plain
numerical vectors, we may escape the two main problems of
that encoding. The proposed approach becomes less sensitive
to the sparse distribution of numerical vectors, because the
similarity among features is captured as well as the
similarity among feature values.
2. RELATED WORK
Let us survey previous cases of encoding texts into
structured forms so that machine learning algorithms can be
used for text mining tasks. Three main problems, huge
dimensionality, sparse distribution, and poor transparency,
are inherent in encoding texts into numerical vectors.
Previous works have proposed various schemes of
pre-processing texts in order to solve these problems.
In paper [1], text segmentation refers to the process of
segmenting an article into several parts based on its
content. Because information retrieval systems tend to
retrieve a long text most frequently by overestimating its
relevancy to a query, the text needs to be segmented into
several parts to avoid this problem. In this task, the text
is given as the input and segmented into paragraphs, a list
of pairs of adjacent paragraphs is generated, and for each
pair it is judged whether a topic boundary is put between
them.
In paper [2], the task of text segmentation is to partition
the text into sentences or paragraphs and to judge whether a
topic boundary falls between two adjacent sentences or
paragraphs. The task may be interpreted as a binary
classification in which each sentence or paragraph pair is
classified as a transition to a different topic or a
continuation of the same topic. Segmentation of speech texts
into sentences or paragraphs is left to future research. In
text categorization the sample texts may span various
domains, whereas in text segmentation the sample paragraphs
should come from a single domain. Therefore, although text
segmentation is a classification task, it should be
distinguished from topic-based text categorization.
In paper [3], the application of back propagation to the
judgment of keywords is validated, though only to a limited
extent; the paper notes that this application could be
defined in various ways. Information systems that deal with
documents, such as Knowledge Management (KM), Information
Retrieval (IR) and Digital Library (DL) systems, require the
storage of documents together with structured data, called
the document surrogate, associated with them. Documents are
written in natural language and cannot be processed directly
by computers. A typical document surrogate, which is
converted from the natural language document by computer,
contains indices of the document and includes the main words
reflecting its contents. Indexing is the process of
converting a document into a list of the words included in
it. The paper proposes applying back propagation and
considering more factors in addition to TF (Term Frequency)
and IDF (Inverse Document Frequency).
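For readers unfamiliar with the two factors named above, TF and IDF can be combined into a word weight as in the sketch below. It uses one common textbook variant (relative term frequency times the logarithm of the inverse document frequency); this variant and the function name `tf_idf` are assumptions for illustration and are not taken from paper [3].

```python
import math
from collections import Counter
from typing import Dict, List


def tf_idf(documents: List[List[str]]) -> List[Dict[str, float]]:
    """Weight each word of each tokenised document by tf * log(N / df)."""
    n_docs = len(documents)
    doc_freq: Counter = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each word once per document
    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        weights.append({
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return weights


docs = [["text", "mining", "text"], ["text", "retrieval"]]
weights = tf_idf(docs)
print(weights[0])
```

A word that occurs in every document, such as "text" here, receives a weight of zero, while a word confined to one document is weighted up; this is why such weights are often used when selecting index words.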
Paper [4] states that text categorization is the process of
assigning one or more of a set of predefined categories to
each document. The task belongs to pattern classification,
where texts or documents are given as patterns. Most
information in any system is given in textual rather than
numerical form, so techniques of text categorization are
necessary for managing it efficiently, and text
categorization has become an interesting research topic in
both the academic and industrial worlds. In this version of
the proposed text categorization system, the number of
entries in each table is fixed; the approach is called the
static index based approach. However, the optimal number of
entries depends strongly on the given document or corpus,
and the size of each table should be optimized with respect
to two factors: reliability and efficiency.
In paper [5], the authors observe that the automated
categorization (or classification) of texts into predefined
categories has witnessed a booming interest in the last 10
years, due to the increased availability of documents in
digital form and the ensuing need to organize them. The
considerations in that survey are not absolute statements on
the comparative effectiveness of TC methods. One reason is
that a particular application context may exhibit very
different characteristics from those found in Reuters, and
different classifiers may respond differently to these
characteristics. An experimental study by Joachims [1998]
involving support vector machines, k-NN, decision trees,
Rocchio, and Naive Bayes showed all these classifiers to
have similar effectiveness on categories with 300 positive
training examples each. That this experiment involved both
the methods that scored best (support vector machines, k-NN)
and those that scored worst (Rocchio and Naive Bayes)
suggests that such rankings depend on the context. The
survey also notes that the most popular approach to TC, at
least in the operational (i.e., real-world applications)
community, was at one time knowledge engineering (KE).
In paper [6], the authors state that text clustering refers
to the process of segmenting a group of documents into
subgroups, each of which contains documents with similar
content. A collection of documents is given as the input of
the task, and several smaller groups of content-based
similar documents are generated as its output. Although
there are many heuristic approaches to the task,
unsupervised learning algorithms have been used as
state-of-the-art approaches to it. Encoding documents into
numerical vectors in order to use traditional unsupervised
learning algorithms for text clustering causes two main
problems. The first is huge dimensionality: documents must
be encoded into numerical vectors of very large dimension to
prevent information loss. In general, documents have been
encoded into numerical vectors of at least several hundred
dimensions in the previous literature. This makes processing
each numerical vector representing a document very expensive
in terms of time and system resources. Furthermore, the
number of training examples required to avoid overfitting
grows in proportion to the dimension. The second problem is
sparse distribution, where zero values dominate each
numerical vector; in other words, more than 90% of the
elements of each numerical vector are zero. This phenomenon
degrades the discrimination among numerical vectors and
causes poor performance in text categorization or text
clustering. To improve the performance of both tasks, the
two problems should be solved.
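The two problems can be seen even on a toy corpus. The sketch below encodes a few tokenised documents as bag-of-words count vectors and measures the fraction of zero entries; the helper names and the three sample documents are hypothetical and chosen only to make the effect visible.

```python
from typing import List, Tuple


def bag_of_words(documents: List[List[str]]) -> Tuple[List[str], List[List[int]]]:
    """Encode tokenised documents as count vectors over the shared vocabulary."""
    vocabulary = sorted({word for tokens in documents for word in tokens})
    index = {word: i for i, word in enumerate(vocabulary)}
    vectors = []
    for tokens in documents:
        vector = [0] * len(vocabulary)
        for word in tokens:
            vector[index[word]] += 1
        vectors.append(vector)
    return vocabulary, vectors


def zero_fraction(vector: List[int]) -> float:
    """Return the share of entries in the vector that are zero."""
    return vector.count(0) / len(vector) if vector else 0.0


corpus = [
    ["exam", "answer", "score"],
    ["tunnel", "steel", "slab"],
    ["query", "index", "corpus"],
]
vocabulary, vectors = bag_of_words(corpus)
print(len(vocabulary), [round(zero_fraction(v), 2) for v in vectors])
```

The dimension equals the vocabulary size, so it grows with every new word in the corpus, while each document fills only the few positions of the words it contains; on a realistic corpus the zero fraction therefore climbs towards the figure quoted above.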
3. PROPOSED SYSTEM
KNN Classifier:
This section describes the KNN classifier, the algorithm
used for text segmentation. KNN is a type of supervised
learning and one of the simplest machine learning algorithms
used for classification. It keeps a record of all previous
cases, and an unknown case is classified by the majority
vote of its K nearest neighbours. Here it considers the
similarity between the attributes of the answer written by
the user and computes the similarities between the features
of that answer and those of the specimen answers. In this
research, we encode sentence pairs or paragraph pairs into
string vectors, and apply the string vector based version of
KNN to the classification task mapped from text
segmentation.
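The majority-vote step of KNN can be sketched as below. This is a generic illustration, not the string vector based version used in this research: the Jaccard word overlap, the function names, and the tiny labelled examples are assumptions standing in for the encoded sentence or paragraph pairs.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Example = Tuple[Sequence[str], str]  # (tokens describing a pair, label)


def overlap(a: Sequence[str], b: Sequence[str]) -> float:
    """Jaccard overlap between two token sequences (an assumed similarity)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def knn_classify(
    query: Sequence[str],
    training: List[Example],
    similarity: Callable[[Sequence[str], Sequence[str]], float],
    k: int = 3,
) -> str:
    """Label the query by majority vote among its k most similar examples."""
    ranked = sorted(training, key=lambda item: similarity(query, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training cases kept by the lazy learner.
training = [
    (["same", "topic", "continues"], "non-separation"),
    (["same", "topic", "again"], "non-separation"),
    (["new", "subject", "starts"], "separation"),
    (["new", "subject", "begins"], "separation"),
    (["different", "subject", "starts"], "separation"),
]

print(knn_classify(["new", "subject", "here"], training, overlap))
print(knn_classify(["same", "topic", "here"], training, overlap))
```

No model is built in advance: all stored cases are compared with the query at classification time, which is what makes KNN an instance-based, lazy learner. The similarity function is passed in as a parameter, so a different measure can replace the word overlap without changing the voting logic.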
NLP:
This section is concerned with Natural Language Processing
(NLP), a field of Artificial Intelligence (AI). It deals
with the interaction between computers and the natural
language used by humans. NLP is helpful in solving many
problems, such as machine translation and text segmentation.
Text Segmentation:
This section is concerned with text segmentation, the
process of dividing written text into smaller meaningful
units. The term applies both to the mental processes used by
humans when reading text and to artificial processes
implemented in computers, which are the subject of natural
language processing. It helps computers process text
automatically and is a precursor to further natural language
processing. At the lowest level, text segmentation
recognizes the boundaries between words.
Data Store:
This section describes the role of the data store in the
process. A data store is a repository for storing
collections of data, such as a database. It is essentially a
connection to a repository of data, whether the data is
stored in a single database or in one or more files. Data
can be read from the data store, results can be exported to
it, or both. The data collected from the users is stored in
the data store; it is then processed and written back so
that users can retrieve their processed data whenever they
want. Hence the data store plays a major role in the entire
process. Data need not be arranged in a relational format to
be stored in the data store.
Fig -1: Architecture Diagram
4. CONCLUSIONS
A web-based examination system has been developed. This
paper describes the principle of the system, presents its
main functions, analyzes the algorithm for automatically
generating test papers, and discusses the security of the
system. With the help of the algorithm, online subjective
exams can be conducted anywhere.
The system saves time because it allows a number of students
to take the exam at the same time and displays the results
as soon as the test is over, so there is no need to wait for
them. The results are generated automatically by the server.
Staff have the privilege to create, modify and delete test
papers and their individual questions. Students can
register, log in and take the test with their specific IDs,
and can see the results as well.
ACKNOWLEDGEMENT
We thank our guide Prof. A. D. Dhawale for his guidance
and support.
REFERENCES
1) Taeho Jo, "Using K Nearest Neighbors for Text
Segmentation with Feature Similarity",
International Conference on Communication,
Control, Computing and Electronics Engineering
(ICCCCEE), 2017.
2) Taeho Jo, "Content based Segmentation of Texts
using Table based KNN", IKE, 2017.
3) Taeho Jo, Malrey Lee, and Thomas M Gatton,
"Keyword Extraction from Documents Using a
Neural Network Model", IEEE, 2016.
4) T. Jo, "NTC (Neural Text Categorizer): Neural
Network for Text Categorization", pp83-96,
International Journal of Information Studies, Vol 2,
No 2, 2010.
5) T. Jo, "Normalized Table Matching Algorithm as
Approach to Text Categorization", pp839-849, Soft
Computing, Vol 19, No 4, 2015.
6) T. Jo, "Single Pass Algorithm for Text Clustering by
Encoding Documents into Tables", pp1749-1757,
Journal of Korea Multimedia Society, Vol 11, No 12,
2008.
7) T. Jo and D. Cho, "Index based Approach for Text
Categorization", International Journal of
Mathematics and Computers in Simulation, Vol 2,
No 1, 2008.
8) H. Lodhi, C. Saunders, J. Shawe-Taylor, N. Cristianini,
and C. Watkins, "Text Classification with String
Kernels", pp419-444, Journal of Machine Learning
Research, Vol 2, No 2, 2002.
9) F. Sebastiani, "Machine Learning in Automated Text
Categorization", pp1-47, ACM Computing Surveys,
Vol 34, No 1, 2002.
10) T. Jo, "Representation of Texts into String Vectors
for Text Categorization", pp110-127, Journal of
Computing Science and Engineering, Vol 4, No 2,
2010.

More Related Content

What's hot

A Comparative Study of Centroid-Based and NaĆÆve Bayes Classifiers for Documen...
A Comparative Study of Centroid-Based and NaĆÆve Bayes Classifiers for Documen...A Comparative Study of Centroid-Based and NaĆÆve Bayes Classifiers for Documen...
A Comparative Study of Centroid-Based and NaĆÆve Bayes Classifiers for Documen...
IJERA Editor
Ā 
G04124041046
G04124041046G04124041046
G04124041046
IOSR-JEN
Ā 
An efficient-classification-model-for-unstructured-text-document
An efficient-classification-model-for-unstructured-text-documentAn efficient-classification-model-for-unstructured-text-document
An efficient-classification-model-for-unstructured-text-document
SaleihGero
Ā 
Bl24409420
Bl24409420Bl24409420
Bl24409420
IJERA Editor
Ā 
Relevance feature discovery for text mining
Relevance feature discovery for text miningRelevance feature discovery for text mining
Relevance feature discovery for text mining
redpel dot com
Ā 
An Evaluation of Preprocessing Techniques for Text Classification
An Evaluation of Preprocessing Techniques for Text ClassificationAn Evaluation of Preprocessing Techniques for Text Classification
An Evaluation of Preprocessing Techniques for Text Classification
IJCSIS Research Publications
Ā 
Query Answering Approach Based on Document Summarization
Query Answering Approach Based on Document SummarizationQuery Answering Approach Based on Document Summarization
Query Answering Approach Based on Document Summarization
IJMER
Ā 
IRJET- Text Document Clustering using K-Means Algorithm
IRJET-  	  Text Document Clustering using K-Means Algorithm IRJET-  	  Text Document Clustering using K-Means Algorithm
IRJET- Text Document Clustering using K-Means Algorithm
IRJET Journal
Ā 
IRJET-Semantic Similarity Between Sentences
IRJET-Semantic Similarity Between SentencesIRJET-Semantic Similarity Between Sentences
IRJET-Semantic Similarity Between Sentences
IRJET Journal
Ā 
Hc3612711275
Hc3612711275Hc3612711275
Hc3612711275
IJERA Editor
Ā 
Legal Document
Legal DocumentLegal Document
Legal Document
legal4
Ā 
Text Document categorization using support vector machine
Text Document categorization using support vector machineText Document categorization using support vector machine
Text Document categorization using support vector machine
IRJET Journal
Ā 
Suitability of naĆÆve bayesian methods for paragraph level text classification...
Suitability of naĆÆve bayesian methods for paragraph level text classification...Suitability of naĆÆve bayesian methods for paragraph level text classification...
Suitability of naĆÆve bayesian methods for paragraph level text classification...
ijaia
Ā 
A CLUSTERING TECHNIQUE FOR EMAIL CONTENT MINING
A CLUSTERING TECHNIQUE FOR EMAIL CONTENT MININGA CLUSTERING TECHNIQUE FOR EMAIL CONTENT MINING
A CLUSTERING TECHNIQUE FOR EMAIL CONTENT MINING
ijcsit
Ā 
8 efficient multi-document summary generation using neural network
8 efficient multi-document summary generation using neural network8 efficient multi-document summary generation using neural network
8 efficient multi-document summary generation using neural network
INFOGAIN PUBLICATION
Ā 

What's hot (15)

A Comparative Study of Centroid-Based and NaĆÆve Bayes Classifiers for Documen...
A Comparative Study of Centroid-Based and NaĆÆve Bayes Classifiers for Documen...A Comparative Study of Centroid-Based and NaĆÆve Bayes Classifiers for Documen...
A Comparative Study of Centroid-Based and NaĆÆve Bayes Classifiers for Documen...
Ā 
G04124041046
G04124041046G04124041046
G04124041046
Ā 
An efficient-classification-model-for-unstructured-text-document
An efficient-classification-model-for-unstructured-text-documentAn efficient-classification-model-for-unstructured-text-document
An efficient-classification-model-for-unstructured-text-document
Ā 
Bl24409420
Bl24409420Bl24409420
Bl24409420
Ā 
Relevance feature discovery for text mining
Relevance feature discovery for text miningRelevance feature discovery for text mining
Relevance feature discovery for text mining
Ā 
An Evaluation of Preprocessing Techniques for Text Classification
An Evaluation of Preprocessing Techniques for Text ClassificationAn Evaluation of Preprocessing Techniques for Text Classification
An Evaluation of Preprocessing Techniques for Text Classification
Ā 
Query Answering Approach Based on Document Summarization
Query Answering Approach Based on Document SummarizationQuery Answering Approach Based on Document Summarization
Query Answering Approach Based on Document Summarization
Ā 
IRJET- Text Document Clustering using K-Means Algorithm
IRJET-  	  Text Document Clustering using K-Means Algorithm IRJET-  	  Text Document Clustering using K-Means Algorithm
IRJET- Text Document Clustering using K-Means Algorithm
Ā 
IRJET-Semantic Similarity Between Sentences
IRJET-Semantic Similarity Between SentencesIRJET-Semantic Similarity Between Sentences
IRJET-Semantic Similarity Between Sentences
Ā 
Hc3612711275
Hc3612711275Hc3612711275
Hc3612711275
Ā 
Legal Document
Legal DocumentLegal Document
Legal Document
Ā 
Text Document categorization using support vector machine
Text Document categorization using support vector machineText Document categorization using support vector machine
Text Document categorization using support vector machine
Ā 
Suitability of naĆÆve bayesian methods for paragraph level text classification...
Suitability of naĆÆve bayesian methods for paragraph level text classification...Suitability of naĆÆve bayesian methods for paragraph level text classification...
Suitability of naĆÆve bayesian methods for paragraph level text classification...
Ā 
A CLUSTERING TECHNIQUE FOR EMAIL CONTENT MINING
A CLUSTERING TECHNIQUE FOR EMAIL CONTENT MININGA CLUSTERING TECHNIQUE FOR EMAIL CONTENT MINING
A CLUSTERING TECHNIQUE FOR EMAIL CONTENT MINING
Ā 
8 efficient multi-document summary generation using neural network
8 efficient multi-document summary generation using neural network8 efficient multi-document summary generation using neural network
8 efficient multi-document summary generation using neural network
Ā 

Similar to Text Segmentation for Online Subjective Examination using Machine Learning

IRJET- Diverse Approaches for Document Clustering in Product Development Anal...
IRJET- Diverse Approaches for Document Clustering in Product Development Anal...IRJET- Diverse Approaches for Document Clustering in Product Development Anal...
IRJET- Diverse Approaches for Document Clustering in Product Development Anal...
IRJET Journal
Ā 
An in-depth review on News Classification through NLP
An in-depth review on News Classification through NLPAn in-depth review on News Classification through NLP
An in-depth review on News Classification through NLP
IRJET Journal
Ā 
Reviews on swarm intelligence algorithms for text document clustering
Reviews on swarm intelligence algorithms for text document clusteringReviews on swarm intelligence algorithms for text document clustering
Reviews on swarm intelligence algorithms for text document clustering
IRJET Journal
Ā 
Semantic Based Document Clustering Using Lexical Chains
Semantic Based Document Clustering Using Lexical ChainsSemantic Based Document Clustering Using Lexical Chains
Semantic Based Document Clustering Using Lexical Chains
IRJET Journal
Ā 
C017321319
C017321319C017321319
C017321319
IOSR Journals
Ā 
Text Document Classification System
Text Document Classification SystemText Document Classification System
Text Document Classification System
IRJET Journal
Ā 
Knowledge Graph and Similarity Based Retrieval Method for Query Answering System
Knowledge Graph and Similarity Based Retrieval Method for Query Answering SystemKnowledge Graph and Similarity Based Retrieval Method for Query Answering System
Knowledge Graph and Similarity Based Retrieval Method for Query Answering System
IRJET Journal
Ā 
Feature selection, optimization and clustering strategies of text documents
Feature selection, optimization and clustering strategies of text documentsFeature selection, optimization and clustering strategies of text documents
Feature selection, optimization and clustering strategies of text documents
IJECEIAES
Ā 
IRJET- Automated Document Summarization and Classification using Deep Lear...
IRJET- 	  Automated Document Summarization and Classification using Deep Lear...IRJET- 	  Automated Document Summarization and Classification using Deep Lear...
IRJET- Automated Document Summarization and Classification using Deep Lear...
IRJET Journal
Ā 
Exploiting Wikipedia and Twitter for Text Mining Applications
Exploiting Wikipedia and Twitter for Text Mining ApplicationsExploiting Wikipedia and Twitter for Text Mining Applications
Exploiting Wikipedia and Twitter for Text Mining Applications
IRJET Journal
Ā 
IRJET- Review on Information Retrieval for Desktop Search Engine
IRJET-  	  Review on Information Retrieval for Desktop Search EngineIRJET-  	  Review on Information Retrieval for Desktop Search Engine
IRJET- Review on Information Retrieval for Desktop Search Engine
IRJET Journal
Ā 
Improved Text Mining for Bulk Data Using Deep Learning Approach
Improved Text Mining for Bulk Data Using Deep Learning Approach Improved Text Mining for Bulk Data Using Deep Learning Approach
Improved Text Mining for Bulk Data Using Deep Learning Approach
IJCSIS Research Publications
Ā 
IRJET- Semantic based Automatic Text Summarization based on Soft Computing
IRJET- Semantic based Automatic Text Summarization based on Soft ComputingIRJET- Semantic based Automatic Text Summarization based on Soft Computing
IRJET- Semantic based Automatic Text Summarization based on Soft Computing
IRJET Journal
Ā 
Survey of Machine Learning Techniques in Textual Document Classification
Survey of Machine Learning Techniques in Textual Document ClassificationSurvey of Machine Learning Techniques in Textual Document Classification
Survey of Machine Learning Techniques in Textual Document Classification
IOSR Journals
Ā 
Converting UML Class Diagrams into Temporal Object Relational DataBase
Converting UML Class Diagrams into Temporal Object Relational DataBase Converting UML Class Diagrams into Temporal Object Relational DataBase
Converting UML Class Diagrams into Temporal Object Relational DataBase
IJECEIAES
Ā 
IRJET- Short-Text Semantic Similarity using Glove Word Embedding
IRJET- Short-Text Semantic Similarity using Glove Word EmbeddingIRJET- Short-Text Semantic Similarity using Glove Word Embedding
IRJET- Short-Text Semantic Similarity using Glove Word Embedding
IRJET Journal
Ā 
Context Driven Technique for Document Classification
Context Driven Technique for Document ClassificationContext Driven Technique for Document Classification
Context Driven Technique for Document Classification
IDES Editor
Ā 
Meta documents and query extension to enhance information retrieval process
Meta documents and query extension to enhance information retrieval processMeta documents and query extension to enhance information retrieval process
Meta documents and query extension to enhance information retrieval process
eSAT Journals
Ā 
Group4 doc
Group4 docGroup4 doc
Group4 doc
firati
Ā 
Machine learning for text document classification-efficient classification ap...
Machine learning for text document classification-efficient classification ap...Machine learning for text document classification-efficient classification ap...
Machine learning for text document classification-efficient classification ap...
IAESIJAI
Ā 

Similar to Text Segmentation for Online Subjective Examination using Machine Learning (20)

IRJET- Diverse Approaches for Document Clustering in Product Development Anal...
IRJET- Diverse Approaches for Document Clustering in Product Development Anal...IRJET- Diverse Approaches for Document Clustering in Product Development Anal...
IRJET- Diverse Approaches for Document Clustering in Product Development Anal...
Ā 
An in-depth review on News Classification through NLP
An in-depth review on News Classification through NLPAn in-depth review on News Classification through NLP
An in-depth review on News Classification through NLP
Ā 
Reviews on swarm intelligence algorithms for text document clustering
Reviews on swarm intelligence algorithms for text document clusteringReviews on swarm intelligence algorithms for text document clustering
Reviews on swarm intelligence algorithms for text document clustering
Ā 
Semantic Based Document Clustering Using Lexical Chains
Semantic Based Document Clustering Using Lexical ChainsSemantic Based Document Clustering Using Lexical Chains
Semantic Based Document Clustering Using Lexical Chains
Ā 
C017321319
C017321319C017321319
C017321319
Ā 
Text Document Classification System
Text Document Classification SystemText Document Classification System
Text Document Classification System
Ā 
Knowledge Graph and Similarity Based Retrieval Method for Query Answering System
Knowledge Graph and Similarity Based Retrieval Method for Query Answering SystemKnowledge Graph and Similarity Based Retrieval Method for Query Answering System
Knowledge Graph and Similarity Based Retrieval Method for Query Answering System
Ā 
Feature selection, optimization and clustering strategies of text documents
Text Segmentation for Online Subjective Examination using Machine Learning

International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 04 Issue: 12 | Dec-2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 6.171 | ISO 9001:2008 Certified Journal | Page 366

Text Segmentation for Online Subjective Examination Using Machine Learning

Shahid Khan1, Rakshanda Chavan2, Diksha Singh3, Tina Sajwan4
1,2,3,4 Modern Education Society's College of Engineering, Pune-411001

Abstract - This paper focuses on text segmentation for natural language using the k-Nearest Neighbour (K-NN) classifier, a type of instance-based (lazy) learning in which the function is only approximated locally and all computation is deferred until classification. Text segmentation divides written text into meaningful units; it is performed both by humans when reading text and by artificial processes implemented in computers as part of natural language processing. A similarity measure among attributes is computed to determine the similarity between feature vectors, K-NN is modified to use this similarity measure, and the modified version is applied to the text segmentation task. The goal of this paper is to implement text segmentation as a natural language processing task and to describe the benefits the approach provides.

Key Words: K-NN, text segmentation, feature similarity, NLP

1. INTRODUCTION

Text segmentation is defined as the process of automatically segmenting a large text into parts based on its topic or content. Information retrieval (IR) systems tend to retrieve long texts that contain more than one topic as highly relevant to the given query, so long texts need to be segmented into partitions topic by topic.
The task of text segmentation is to partition the text into sentences or paragraphs and to judge whether a topic boundary should be placed between two adjacent sentences or paragraphs. In this task, the text is given as the input and segmented into paragraphs, a list of pairs of adjacent paragraphs is generated, and each pair is judged as to whether a topic boundary lies between them. The task is thus interpreted as a binary classification in which each sentence or paragraph pair is classified as separation (a transition to a different topic) or non-separation (the continuation of the same topic).

Some issues are caused by encoding texts into numerical vectors and computing their similarities based only on attribute values. The first problem is huge dimensionality, which makes processing each numerical vector representing a document very costly in terms of time and system resources; moreover, the number of training examples required to avoid overfitting grows in proportion to the dimension. The second problem is sparse distribution, where zero values dominate each numerical vector.

The agenda of this research is as follows. We assume that words are given as the features of numerical vectors in encoding texts and that they have semantic relations with one another. Based on this assumption, we define a similarity measure between feature vectors that considers both the features and the feature values. We modify KNN into a version that uses both the feature similarity and the feature value similarity, and apply it to the classification task mapped from text segmentation. As benefits of this research, we expect greater tolerance to sparse distributions and the potential avoidance of huge dimensionality.
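The modified KNN described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: no formula is given at this point in the paper, so the sketch assumes a soft-cosine form in which a feature similarity matrix weights the products of feature values. The names `feat_sim`, `feature_aware_similarity`, and `knn_classify`, as well as all numeric values, are hypothetical.

```python
from collections import Counter
from math import sqrt


def feature_aware_similarity(x, y, feat_sim):
    """Soft-cosine style similarity (an assumed form, not taken from the paper).

    x, y      : feature value vectors of equal length
    feat_sim  : square matrix; feat_sim[i][j] is the assumed semantic
                similarity between feature (word) i and feature j
    """
    def bilinear(a, b):
        # Sum of a[i] * feat_sim[i][j] * b[j] over all feature pairs.
        return sum(a[i] * feat_sim[i][j] * b[j]
                   for i in range(len(a)) for j in range(len(b)))

    denom = sqrt(bilinear(x, x) * bilinear(y, y))
    return bilinear(x, y) / denom if denom > 0 else 0.0


def knn_classify(query, examples, feat_sim, k=3):
    """Majority vote among the k training examples most similar to `query`.

    examples : list of (vector, label) pairs, where label is
               "separation" or "non-separation"
    """
    ranked = sorted(
        examples,
        key=lambda ex: feature_aware_similarity(query, ex[0], feat_sim),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Two vectors with no shared non-zero feature: plain cosine would give 0,
# but the related features (similarity 0.5) still contribute.
toy_feat_sim = [[1.0, 0.5],
                [0.5, 1.0]]
print(feature_aware_similarity([1.0, 0.0], [0.0, 1.0], toy_feat_sim))  # 0.5
```

With an identity matrix as `feat_sim`, the measure reduces to the ordinary cosine similarity, which is one way to see why capturing feature similarity may help with sparse vectors.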
The benefits expected from this research are as follows. We may potentially cut down the dimensionality in encoding texts into numerical vectors. The information loss in computing the similarity between texts may be reduced by reflecting the similarities among the features. By representing texts in a form alternative to plain numerical vectors, we may escape the two main problems of that encoding. The proposed approach becomes less sensitive to the sparse distribution of numerical vectors, because the similarity among features is captured as well as the similarity among feature values.

2. RELATED WORK

Let us survey previous cases of encoding texts into structured forms for applying machine learning algorithms to text mining tasks. Three main problems, huge dimensionality, sparse distribution, and poor transparency, are inherent in encoding texts into numerical vectors. Previous works have proposed various schemes of pre-processing texts in order to solve these problems.

Paper [1] states that text segmentation refers to the process of segmenting an article into several parts based on its content. Because in information retrieval systems a long text tends to be retrieved most frequently, owing to overestimation of its relevancy to a query, we need to segment it into several parts in order to avoid the
problem. In this task, the text is given as the input and segmented into paragraphs, a list of pairs of adjacent paragraphs is generated, and each pair is judged as to whether a topic boundary lies between them.

In paper [2], the task of text segmentation is to partition the text into sentences and paragraphs and to judge whether a topic boundary should be placed between two adjacent sentences or paragraphs. The task may be interpreted as a binary classification in which each sentence or paragraph pair is classified as a transition to a different topic or the continuation of the same topic. Segmentation of speech texts into sentences or paragraphs may be considered but is left for future research. In text categorization, the sample texts may span various domains, whereas in text segmentation the sample paragraphs should be within one domain. Therefore, although text segmentation belongs to the classification tasks, it should be distinguished from topic-based text categorization. Text segmentation is mapped into a binary classification.

In paper [3], the application of back propagation to the judgment of keywords is validated in a restricted setting. The application of back propagation to the judgment of keywords may be defined in various ways. Information systems dealing with documents, such as Knowledge Management (KM), Information Retrieval (IR), and Digital Library (DL) systems, require the storage of documents together with structured data associated with them, called the document surrogate. Documents are written in natural language and cannot be processed directly by computers.
A typical document surrogate, which is converted from the natural language document by computer, contains indices of the document and includes the main words reflecting its contents. Indexing is the process of converting a document into a list of the words included in it. That paper proposed the application of back propagation and the consideration of more factors in addition to TF (Term Frequency) and IDF (Inverse Document Frequency).

Paper [4] states that text categorization is the process of assigning one or more predefined categories to each document. The task belongs to pattern classification, where texts or documents are given as patterns. Note that most information in any system is given in textual rather than numerical form. Techniques of text categorization are necessary for managing such textual information efficiently, and text categorization has become a very interesting research topic in both academia and industry. In this version of the proposed text categorization system, the number of entries in each table is fixed; this is called the static index based approach. However, the optimal number of entries depends heavily on the given document or corpus. The size of each table should be optimized in terms of two factors: reliability and efficiency.
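The TF and IDF factors mentioned in the discussion of paper [3] can be illustrated with a short sketch. It assumes one common textbook variant (relative term frequency multiplied by the natural logarithm of the inverse document frequency); the surveyed papers may use a different weighting, and the function name `tf_idf` and the toy documents are hypothetical.

```python
from collections import Counter
from math import log


def tf_idf(documents):
    """Weight each word of each document by TF * IDF.

    documents : list of token lists (already indexed into words)
    returns   : list of {word: weight} dicts, one per document
    """
    n_docs = len(documents)

    # Document frequency: in how many documents each word occurs.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))

    weighted = []
    for tokens in documents:
        counts = Counter(tokens)
        weighted.append({
            # TF: relative frequency in this document; IDF: log(N / df).
            word: (count / len(tokens)) * log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return weighted


docs = [["text", "segmentation", "text"], ["text", "clustering"]]
weights = tf_idf(docs)
# "text" occurs in every document, so its IDF, and hence its weight, is 0.
print(weights[0]["text"])  # 0.0
```

Words that appear in every document receive zero weight under this variant, which is why such weights are often used to pick the main words for a document surrogate.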
One of the reasons is that a particular applicative context may exhibit very different characteristics from the ones to be found in Reuters, and different classifiers may respond differently to these characteristics. An experimental study by Joachims [1998] involving support vector machines, k-NN, decision trees, Rocchio, and Naive Bayes showed all these classifiers to have similar effectiveness on categories with 300 positive training examples each. This experiment involved both the methods which have scored best (support vector machines, k-NN) and those which have scored worst (Rocchio and Naive Bayes). The most popular approach to TC, at least in the operational (i.e., real-world applications) community, was a knowledge engineering (KE) approach.

In paper [6], the authors describe text clustering as the process of segmenting a group of documents into subgroups, each of which contains content-based similar documents. A collection of documents is given as the input of the task, and several smaller groups of content-based similar documents are generated as its output. Although there are many heuristic approaches to the task, unsupervised learning algorithms have been used as state-of-the-art approaches to it. Encoding documents into numerical vectors in order to use traditional unsupervised learning algorithms for text clustering causes two main problems. The first problem is huge dimensionality: documents must be encoded into very high-dimensional numerical vectors to prevent information loss. In general, documents have been encoded into numerical vectors of at least several hundred dimensions in the previous literature. This makes processing each numerical vector representing a document very expensive in terms of time and system resources.
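The encoding step described above can be illustrated with a small sketch. It assumes a plain bag-of-words count encoding, which is only one possible scheme (the source does not say which encoding the cited work used), and the names `build_vocabulary`, `encode` and `zero_fraction` are hypothetical.

```python
import re


def tokenize(document):
    """Split a document into lowercase words."""
    return re.findall(r"[a-z]+", document.lower())


def build_vocabulary(corpus):
    """Sorted list of every distinct word in the corpus.

    Its length is the dimension of every encoded vector.
    """
    return sorted({word for doc in corpus for word in tokenize(doc)})


def encode(document, vocabulary):
    """Encode a document as a count vector with one entry per vocabulary word."""
    tokens = tokenize(document)
    return [tokens.count(word) for word in vocabulary]


def zero_fraction(vector):
    """Fraction of entries in the vector that are zero."""
    return vector.count(0) / len(vector) if vector else 0.0


# Toy corpus, invented for illustration only.
corpus = [
    "text segmentation splits a long text by topic",
    "support vector machines classify documents",
    "clustering groups similar documents together",
]
vocabulary = build_vocabulary(corpus)
vectors = [encode(doc, vocabulary) for doc in corpus]

if __name__ == "__main__":
    print("dimension:", len(vocabulary))
    print("zero fractions:", [zero_fraction(v) for v in vectors])
```

Each document receives one entry per distinct word in the whole corpus, so the dimension grows with the vocabulary rather than with the length of any single document.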
Furthermore, many more training examples are required, in proportion to the dimension, to avoid overfitting. The second problem is sparse distribution, where zero values dominate each numerical vector. In other words, more than 90% of the elements of each numerical vector are zero. This phenomenon degrades the discrimination among numerical vectors, which causes poor performance in text categorization or text clustering. To improve the performance of both tasks, the two problems should be solved.

3. PROPOSED SYSTEM

KNN Classifier: This section describes the KNN classifier, the algorithm used for text segmentation. It keeps the record
of all previous cases, and each new, unknown case is classified against them. It is a type of supervised learning. The unknown case is classified by the majority vote of its K nearest neighbours. It is one of the simplest machine learning algorithms used for classification. It considers the similarity among the attributes of the answers written by the user, and then computes the similarity between the features of an answer and those of the specimen answers. In this research, we encode sentence pairs or paragraph pairs into string vectors, and apply the string vector based version of KNN to the classification task mapped from text segmentation.

NLP: This section is concerned with Natural Language Processing, which is a field of AI (Artificial Intelligence). It is about the interaction between the computer and the natural language used by humans. NLP is helpful in solving many problems, such as machine translation and text segmentation.

Text Segmentation: This section is concerned with text segmentation, the process in which written text is divided into smaller parts. The term applies both to mental processes used by humans when reading text, and to artificial processes implemented in computers, which are the subject of natural language processing. It helps computers carry out such artificial processes and is a precursor to natural language processing. Text segmentation recognizes the boundaries between words.

Data Store: This section describes the role of the data store in the process. A data store is a repository for storing collections of data, such as a database.
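Returning to the KNN classifier described earlier in this section, the majority vote over adjacent paragraph pairs can be sketched as follows. This is not the string vector based version of KNN used in this research: as a simplifying assumption, each pair is reduced to a single feature, the Jaccard word overlap between its two paragraphs, and the labelled training examples in `TRAIN` are invented for illustration. All function names are hypothetical.

```python
import re
from collections import Counter


def words(text):
    """Set of lowercase words in a paragraph."""
    return set(re.findall(r"[a-z]+", text.lower()))


def overlap(left, right):
    """Jaccard overlap between the word sets of two adjacent paragraphs."""
    a, b = words(left), words(right)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def adjacent_pairs(paragraphs):
    """List of (paragraph i, paragraph i+1) pairs, one per candidate boundary."""
    return list(zip(paragraphs, paragraphs[1:]))


def knn_predict(train, query, k=3):
    """Label the query by majority vote of its k nearest training examples.

    train is a list of (feature, label) tuples; distance is |feature - query|.
    """
    nearest = sorted(train, key=lambda item: abs(item[0] - query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Invented training examples: low overlap marks a topic boundary.
TRAIN = [
    (0.00, "boundary"), (0.05, "boundary"), (0.10, "boundary"),
    (0.45, "continuation"), (0.50, "continuation"), (0.60, "continuation"),
]

paragraphs = [
    "The cat sat on the mat.",
    "The cat chased the mouse on the mat.",
    "Stock prices fell sharply today.",
]
labels = [knn_predict(TRAIN, overlap(a, b)) for a, b in adjacent_pairs(paragraphs)]

if __name__ == "__main__":
    print(labels)  # -> ['continuation', 'boundary']
```

With an odd K and two classes the vote cannot tie, which is why K = 3 is used in this sketch.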
A data store is basically a connection to the repository of data, whether the data is stored in a single database or in one or more different files. The data store can be used to obtain data, to store data exported from results, or both. The data collected from the users is stored in the data store. During processing, the data in the data store is processed and stored back into the data store so that users can retrieve their processed data whenever they want. Hence the data store plays a major role in the entire process. Data need not be arranged in a relational format to be stored in the data store.

Fig -1: Architecture Diagram

4. CONCLUSIONS

An examination system is developed based on the web. This paper describes the principle of the system, presents the main functions of the system, analyzes the auto-generating test paper algorithm, and discusses the security of the system. With the help of the algorithm we can conduct online subjective exams anywhere. It saves time, as it allows a number of students to take the exam at the same time and displays the results as soon as the test is over, so there is no need to wait for the result. The result is generated automatically by the server. Staff have the privilege to create, modify and delete test papers and their questions. A student can register, log in and take the test with his or her specific id, and can see the results as well.

ACKNOWLEDGEMENT

We thank our guide Prof. A. D. Dhawale for his guidance and support.

REFERENCES

1) Taeho Jo, "Using K Nearest Neighbors for Text Segmentation with Feature Similarity", International Conference on Communication, Control, Computing and Electronics Engineering (ICCCCEE), 2017.
2) Taeho Jo, "Content based Segmentation of Texts using Table based KNN", IKE, 2017.
3) Taeho Jo, Malrey Lee, and Thomas M Gatton, "Keyword Extraction from Documents Using a Neural Network Model", IEEE, 2016.
4) T. Jo, "NTC (Neural Text Categorizer): Neural Network for Text Categorization", pp83-96, International Journal of Information Studies, Vol 2, No 2, 2010.
5) T. Jo, "Normalized Table Matching Algorithm as Approach to Text Categorization", pp839-849, Soft Computing, Vol 19, No 4, 2015.
6) T. Jo, "Single Pass Algorithm for Text Clustering by Encoding Documents into Tables", pp1749-1757, Journal of Korea Multimedia Society, Vol 11, No 12, 2008.
7) T. Jo and D. Cho, "Index based Approach for Text Categorization", International Journal of Mathematics and Computers in Simulation, Vol 2, No 1, 2008.
8) H. Lodhi, C. Saunders, J. Shawe-Taylor, N. Cristianini, and C. Watkins, "Text Classification with String Kernels", pp419-444, Journal of Machine Learning Research, Vol 2, No 2, 2002.
9) F. Sebastiani, "Machine Learning in Automated Text Categorization", pp1-47, ACM Computing Surveys, Vol 34, No 1, 2002.
10) T. Jo, "Representation of Texts into String Vectors for Text Categorization", pp110-127, Journal of Computing Science and Engineering, Vol 4, No 2, 2010.