BY
N. SUMANJALI
DPT OF LIS
PONDICHERRY UNIVERSITY
INFORMATION RETRIEVAL
 Information retrieval is the activity of obtaining
information resources relevant to an information need
from a collection of information resources.
 Searches can be based on metadata or on full-text (or
other content-based) indexing.
 Goal: Find the documents most relevant to a certain Query
 Dealing with notions of:
 Collection of documents
 Query (User’s information need)
 Notion of Relevancy
MODEL
 A model is a construct designed to help us understand a
complex system
 A particular way of “looking at things”
 Models inevitably make simplifying assumptions
 What are the limitations of the model?
 Different types of models:
 Conceptual models
 Physical analog models
 Mathematical models
Retrieval Models
A retrieval model specifies the details
of:
 Document representation
 Query representation
 Retrieval function
Determines a notion of relevance.
Notion of relevance can be binary or
continuous (i.e. ranked retrieval).
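The three parts of a retrieval model can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the toy collection `DOCS`, the function names, and the choice of "fraction of query terms matched" as the continuous score are all assumptions made for the example.

```python
from typing import Callable, Dict, Set

# Hypothetical toy collection; IDs and texts are illustrative only.
DOCS: Dict[str, str] = {
    "d1": "boolean retrieval uses sets",
    "d2": "vector space retrieval ranks documents",
}


def represent(text: str) -> Set[str]:
    """Document and query representation: a set of lowercase terms."""
    return set(text.lower().split())


def binary_match(doc: Set[str], query: Set[str]) -> float:
    """Binary relevance: 1.0 if every query term is in the document, else 0.0."""
    return 1.0 if query <= doc else 0.0


def overlap_score(doc: Set[str], query: Set[str]) -> float:
    """Continuous relevance: fraction of query terms found in the document."""
    return len(doc & query) / len(query) if query else 0.0


def retrieve(query_text: str,
             score: Callable[[Set[str], Set[str]], float]) -> Dict[str, float]:
    """Retrieval function: score every document against the query."""
    query = represent(query_text)
    return {doc_id: score(represent(text), query) for doc_id, text in DOCS.items()}


print(retrieve("boolean retrieval", binary_match))
print(retrieve("boolean retrieval", overlap_score))
```

Swapping `binary_match` for `overlap_score` changes only the retrieval function, which is what moves the notion of relevance from binary to continuous.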
CLASSES OF RM
Boolean models (set theoretic)
 Extended Boolean
Vector space models
(statistical/algebraic)
 Generalized VS
 Latent Semantic Indexing
Probabilistic models
MODELS OF IR
 Boolean model
 Based on the notion of sets
 Documents are retrieved only if they satisfy Boolean
conditions specified in the query
 Does not impose a ranking on retrieved documents
 Exact match
 Vector space model
 Based on geometry, the notion of vectors in high dimensional
space
 Documents are ranked based on their similarity to the query
(ranked retrieval)
 Best/partial match
 Language models
 Based on the notion of probabilities and processes for
generating text
 Documents are ranked based on the probability that
they generated the query
 Best/partial match
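To make "ranked by similarity to the query" concrete, here is a small sketch of the vector space idea using raw term counts and cosine similarity. The documents, the query, and the use of unweighted counts (rather than a weighting scheme such as tf-idf) are assumptions chosen to keep the example short.

```python
import math
from collections import Counter

# Hypothetical collection for illustration.
DOCS = {
    "d1": "hotel in rio brazil",
    "d2": "hotel in hilo hawaii",
    "d3": "beach party",
}


def cosine(doc_text: str, query_text: str) -> float:
    """Cosine of the angle between term-count vectors of document and query."""
    d = Counter(doc_text.lower().split())
    q = Counter(query_text.lower().split())
    dot = sum(d[t] * q[t] for t in q)
    norm = (math.sqrt(sum(v * v for v in d.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0


def rank(query_text: str) -> list:
    """Return document IDs ordered from most to least similar."""
    return sorted(DOCS, key=lambda d: cosine(DOCS[d], query_text), reverse=True)


print(rank("rio hotel"))
```

Here `d2` shares only one term with the query and still receives a nonzero score, which is the partial matching that an exact-match Boolean query does not provide.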
BOOLEAN MODEL
 Based on the logic of George Boole (1815-1864)
 He devised a system of symbolic logic in which he used
three operators (+, ×, -) to combine statements in
symbolic form.
 As named by John Venn, the operators of Boolean logic
are the logical sum (+), logical product (×), and logical
difference (-).
 IR systems allow users to express their queries
using these operators.
BOOLEAN MODEL
 Each index term is either present or absent
 Documents are either Relevant or Not Relevant (no
ranking)
 A document is represented as a set of keywords.
 Queries are Boolean expressions of
keywords, connected by AND, OR, and
NOT, including the use of brackets to indicate scope.
 [[Rio & Brazil] | [Hilo & Hawaii]] & hotel & !Hilton
 Output: Document is relevant or not. No partial
matches or ranking.
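The example query above can be evaluated directly when each document is a set of keywords. The sketch below assumes four made-up documents and hard-codes the query as a Python predicate; a real system would parse the query string instead.

```python
# Hypothetical documents, each represented as a set of keywords.
docs = {
    "d1": {"rio", "brazil", "hotel"},
    "d2": {"hilo", "hawaii", "hotel", "hilton"},
    "d3": {"hilo", "hawaii", "hotel"},
    "d4": {"rio", "brazil", "hostel"},
}


def matches(terms: set) -> bool:
    """[[Rio & Brazil] | [Hilo & Hawaii]] & hotel & !Hilton"""
    place = (("rio" in terms and "brazil" in terms)
             or ("hilo" in terms and "hawaii" in terms))
    return place and "hotel" in terms and "hilton" not in terms


# Output is a yes/no decision per document; there is no score to sort by.
relevant = sorted(doc_id for doc_id, terms in docs.items() if matches(terms))
print(relevant)
```

`d2` is rejected only because it mentions Hilton, and `d4` only because it says "hostel" rather than "hotel": near misses are treated exactly like unrelated documents.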
BOOLEAN RETRIEVAL MODEL
 Popular retrieval model because:
 Easy to understand for simple queries.
 Clean formalism.
 Boolean models can be extended to include ranking.
 Reasonably efficient implementations possible for
normal queries.
BOOLEAN MODEL
 Weights assigned to terms are either “0” or “1”
 “0” represents “absence”: term isn’t in the document
 “1” represents “presence”: term is in the document
 Build queries by combining terms with Boolean
operators
 AND, OR, NOT
 The system returns all documents that satisfy the
query
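One common way to realise the 0/1 weights is an inverted index that maps each term to the set of documents containing it, so that AND, OR, and NOT become set intersection, union, and complement. The following is a sketch under that assumption; the collection and the helper names (`and_`, `or_`, `not_`) are invented for the example.

```python
from collections import defaultdict

# Hypothetical collection: document ID -> text.
DOCS = {1: "good party", 2: "democratic party", 3: "good food", 4: "wild party"}


def build_index(docs: dict) -> dict:
    """Map each term to the set of document IDs in which it is present."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


INDEX = build_index(DOCS)
ALL_IDS = set(DOCS)


def term(t: str) -> set:
    """Documents whose weight for term t is 1."""
    return INDEX.get(t, set())


def and_(a: set, b: set) -> set:
    return a & b


def or_(a: set, b: set) -> set:
    return a | b


def not_(a: set) -> set:
    return ALL_IDS - a


# (good OR wild) AND party AND NOT democratic
result = and_(and_(or_(term("good"), term("wild")), term("party")),
              not_(term("democratic")))
print(sorted(result))
```

The result is an unordered set: every document in it satisfies the query, and nothing distinguishes one hit from another.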
AND/OR/NOT
[Figure: diagram of three sets labeled A, B, and C]
Why Boolean Retrieval Works
 Boolean operators approximate natural language
 Find documents about a good party that is not over
 AND can discover relationships between concepts
 good party
 OR can discover alternate terminology
 excellent party, wild party, etc.
 NOT can discover alternate meanings
 Democratic party
The Perfect Query Paradox
 Every information need has a perfect set of documents
 If not, there would be no sense doing retrieval
 Every document set has a perfect query
 AND every word in a document to get a query for it
 Repeat for each document in the set
 OR every document query to get the set query
 But can users realistically be expected to formulate this
perfect query?
 Boolean query formulation is hard!
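The construction described above can be written out mechanically. In this sketch a query is a list of conjunctions (an OR of ANDs); the collection and target set are hypothetical. One caveat is stated as an assumption: the construction as given uses no NOT, so a document containing every word of a target document plus extra words would also match. The toy collection avoids that case.

```python
# Hypothetical collection: document ID -> set of words.
DOCS = {
    "d1": {"good", "party"},
    "d2": {"democratic", "party"},
    "d3": {"good", "food"},
    "d4": {"wild", "night"},
}


def doc_query(words: set) -> frozenset:
    """AND every word in a document: represented as one conjunction."""
    return frozenset(words)


def set_query(target_ids: list) -> list:
    """OR the per-document queries to get the query for the whole set."""
    return [doc_query(DOCS[doc_id]) for doc_id in target_ids]


def evaluate(query: list, words: set) -> bool:
    """A document matches if it satisfies at least one conjunction."""
    return any(conj <= words for conj in query)


perfect = set_query(["d1", "d2"])
retrieved = {doc_id for doc_id, words in DOCS.items() if evaluate(perfect, words)}
print(sorted(retrieved))
```

The program needs to know the target documents in advance, which is exactly what a user formulating a query does not have.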
Why Boolean Retrieval Fails
• Natural language is way more complex
• AND “discovers” nonexistent relationships
– Terms in different sentences, paragraphs, …
• Guessing terminology for OR is hard
– good, nice, excellent, outstanding, awesome, …
• Guessing terms to exclude is even harder!
– Democratic party, party to a lawsuit, …
BOOLEAN MODEL
 Strengths
 Precise, if you know the right strategies
 Precise, if you have an idea of what you’re looking for
 Efficient for the computer
 Simple
 Weaknesses
 Users must learn Boolean logic
 Boolean logic insufficient to capture the richness of language
 No control over size of result set: either too many documents or none
 When do you stop reading? All documents in the result set are
considered “equally good”
 What about partial matches? Documents that “don’t quite match” the
query may be useful also
 No notion of ranking (exact matching only)
 All index terms have equal weight
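The result-set weakness is easy to reproduce. The sketch below runs the same three terms as a strict AND and as a strict OR over a hypothetical collection, then counts matched terms per document to show the partial-match information that exact matching discards.

```python
# Hypothetical collection: document ID -> set of index terms.
DOCS = {
    "d1": {"good", "party"},
    "d2": {"excellent", "party"},
    "d3": {"good", "food"},
    "d4": {"wild", "party", "music"},
}
QUERY = ["good", "excellent", "party"]

# AND means all terms: here nothing qualifies.
and_hits = sorted(d for d, terms in DOCS.items() if all(q in terms for q in QUERY))

# OR means any term: here everything qualifies.
or_hits = sorted(d for d, terms in DOCS.items() if any(q in terms for q in QUERY))

# Number of query terms each document contains (ignored by the Boolean model).
partial = {d: sum(q in terms for q in QUERY) for d, terms in DOCS.items()}

print(and_hits)
print(or_hits)
print(partial)
```

`d1` and `d2` each match two of the three terms, yet the AND query drops them and the OR query cannot place them ahead of `d3` and `d4`.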
PROBLEMS
 Very rigid: AND means all; OR means any.
 Difficult to express complex user requests.
 Difficult to control the number of documents retrieved.
 All matched documents will be returned.
 Difficult to rank output.
 All matched documents logically satisfy the query.
 Difficult to perform relevance feedback.
 If a document is identified by the user as relevant or
irrelevant, how should the query be modified?
ADVANTAGES & DISADVANTAGES
 Advantages
 Results are predictable, relatively easy to explain
 Many different features can be incorporated
 Efficient processing since many documents can be
eliminated from search
 Disadvantages
 Effectiveness depends entirely on user
 Simple queries usually don’t work well
 Complex queries are difficult.
LIMITATIONS
 The first relates to the formulation of search statements.
 It has been noted that users are not able to formulate an exact search
statement by the combination of AND, OR and NOT operators,
especially when several query terms are involved.
 In such cases either the search statement becomes too narrow or too
broad.
 The second limitation relates to the number of retrieved items.
 It has been noted that users cannot predict a priori exactly how many
items will be retrieved to satisfy a given query.
 If the search statement is broad, the number of retrieved items may
sometimes be several hundreds and thus it may be quite difficult to
find out the exact information required.
 The third limitation is that the model identifies an item as relevant only by
checking whether a given query term is present in a given record in the
database.
Model  of information retrieval (3)

More Related Content

What's hot

Vector space model in information retrieval
Vector space model in information retrievalVector space model in information retrieval
Vector space model in information retrieval
Tharuka Vishwajith Sarathchandra
 
Information retrieval s
Information retrieval sInformation retrieval s
Information retrieval s
silambu111
 
Information Retrieval Evaluation
Information Retrieval EvaluationInformation Retrieval Evaluation
Information Retrieval Evaluation
José Ramón Ríos Viqueira
 
Vector space model of information retrieval
Vector space model of information retrievalVector space model of information retrieval
Vector space model of information retrieval
Nanthini Dominique
 
Inverted index
Inverted indexInverted index
Inverted index
Krishna Gehlot
 
Information retrieval 7 boolean model
Information retrieval 7 boolean modelInformation retrieval 7 boolean model
Information retrieval 7 boolean model
Vaibhav Khanna
 
The vector space model
The vector space modelThe vector space model
The vector space model
pkgosh
 
Information_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_HabibInformation_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_Habib
El Habib NFAOUI
 
Automatic indexing
Automatic indexingAutomatic indexing
Automatic indexing
dhatchayaninandu
 
Indexing Techniques: Their Usage in Search Engines for Information Retrieval
Indexing Techniques: Their Usage in Search Engines for Information RetrievalIndexing Techniques: Their Usage in Search Engines for Information Retrieval
Indexing Techniques: Their Usage in Search Engines for Information Retrieval
Vikas Bhushan
 
Information retrieval 13 alternative set theoretic models
Information retrieval 13 alternative set theoretic modelsInformation retrieval 13 alternative set theoretic models
Information retrieval 13 alternative set theoretic models
Vaibhav Khanna
 
Introduction to Information Retrieval
Introduction to Information RetrievalIntroduction to Information Retrieval
Introduction to Information Retrieval
Roi Blanco
 
Tdm information retrieval
Tdm information retrievalTdm information retrieval
Tdm information retrieval
KU Leuven
 
Information retrieval-systems notes
Information retrieval-systems notesInformation retrieval-systems notes
Information retrieval-systems notes
BAIRAVI T
 
The impact of web on ir
The impact of web on irThe impact of web on ir
The impact of web on ir
Primya Tamil
 
Information Retrieval
Information RetrievalInformation Retrieval
Introduction to Information Retrieval & Models
Introduction to Information Retrieval & ModelsIntroduction to Information Retrieval & Models
Introduction to Information Retrieval & Models
Mounia Lalmas-Roelleke
 
Information retrieval introduction
Information retrieval introductionInformation retrieval introduction
Information retrieval introduction
nimmyjans4
 
Evaluation in Information Retrieval
Evaluation in Information RetrievalEvaluation in Information Retrieval
Evaluation in Information Retrieval
Dishant Ailawadi
 
WEB BASED INFORMATION RETRIEVAL SYSTEM
WEB BASED INFORMATION RETRIEVAL SYSTEMWEB BASED INFORMATION RETRIEVAL SYSTEM
WEB BASED INFORMATION RETRIEVAL SYSTEM
Sai Kumar Ale
 

What's hot (20)

Vector space model in information retrieval
Vector space model in information retrievalVector space model in information retrieval
Vector space model in information retrieval
 
Information retrieval s
Information retrieval sInformation retrieval s
Information retrieval s
 
Information Retrieval Evaluation
Information Retrieval EvaluationInformation Retrieval Evaluation
Information Retrieval Evaluation
 
Vector space model of information retrieval
Vector space model of information retrievalVector space model of information retrieval
Vector space model of information retrieval
 
Inverted index
Inverted indexInverted index
Inverted index
 
Information retrieval 7 boolean model
Information retrieval 7 boolean modelInformation retrieval 7 boolean model
Information retrieval 7 boolean model
 
The vector space model
The vector space modelThe vector space model
The vector space model
 
Information_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_HabibInformation_Retrieval_Models_Nfaoui_El_Habib
Information_Retrieval_Models_Nfaoui_El_Habib
 
Automatic indexing
Automatic indexingAutomatic indexing
Automatic indexing
 
Indexing Techniques: Their Usage in Search Engines for Information Retrieval
Indexing Techniques: Their Usage in Search Engines for Information RetrievalIndexing Techniques: Their Usage in Search Engines for Information Retrieval
Indexing Techniques: Their Usage in Search Engines for Information Retrieval
 
Information retrieval 13 alternative set theoretic models
Information retrieval 13 alternative set theoretic modelsInformation retrieval 13 alternative set theoretic models
Information retrieval 13 alternative set theoretic models
 
Introduction to Information Retrieval
Introduction to Information RetrievalIntroduction to Information Retrieval
Introduction to Information Retrieval
 
Tdm information retrieval
Tdm information retrievalTdm information retrieval
Tdm information retrieval
 
Information retrieval-systems notes
Information retrieval-systems notesInformation retrieval-systems notes
Information retrieval-systems notes
 
The impact of web on ir
The impact of web on irThe impact of web on ir
The impact of web on ir
 
Information Retrieval
Information RetrievalInformation Retrieval
Information Retrieval
 
Introduction to Information Retrieval & Models
Introduction to Information Retrieval & ModelsIntroduction to Information Retrieval & Models
Introduction to Information Retrieval & Models
 
Information retrieval introduction
Information retrieval introductionInformation retrieval introduction
Information retrieval introduction
 
Evaluation in Information Retrieval
Evaluation in Information RetrievalEvaluation in Information Retrieval
Evaluation in Information Retrieval
 
WEB BASED INFORMATION RETRIEVAL SYSTEM
WEB BASED INFORMATION RETRIEVAL SYSTEMWEB BASED INFORMATION RETRIEVAL SYSTEM
WEB BASED INFORMATION RETRIEVAL SYSTEM
 

Viewers also liked

Information storage and retrieval
Information storage and retrievalInformation storage and retrieval
Information storage and retrieval
Sadaf Rafiq
 
Copyright issues in a library digital environment
Copyright issues in a library digital environmentCopyright issues in a library digital environment
Copyright issues in a library digital environment
Fe Angela Verzosa
 
Boolean Matching in Logic Synthesis
Boolean Matching in Logic SynthesisBoolean Matching in Logic Synthesis
Boolean Matching in Logic Synthesis
Iffat Anjum
 
Some Information Retrieval Models and Our Experiments for TREC KBA
Some Information Retrieval Models and Our Experiments for TREC KBASome Information Retrieval Models and Our Experiments for TREC KBA
Some Information Retrieval Models and Our Experiments for TREC KBA
Patrice Bellot - Aix-Marseille Université / CNRS (LIS, INS2I)
 
Ir models
Ir modelsIr models
Ir models
Ambreen Angel
 
Introduction to Information Retrieval
Introduction to Information RetrievalIntroduction to Information Retrieval
Introduction to Information Retrieval
Carsten Eickhoff
 
SemTech 2011 Semantic Search tutorial
SemTech 2011 Semantic Search tutorialSemTech 2011 Semantic Search tutorial
SemTech 2011 Semantic Search tutorial
Peter Mika
 
Models for Information Retrieval and Recommendation
Models for Information Retrieval and RecommendationModels for Information Retrieval and Recommendation
Models for Information Retrieval and Recommendation
Arjen de Vries
 
Planning and Implementing a Digital Library Project
Planning and Implementing a Digital Library ProjectPlanning and Implementing a Digital Library Project
Planning and Implementing a Digital Library Project
Jenn Riley
 
E-RESOURCES
E-RESOURCESE-RESOURCES
E-RESOURCES
jasminshamnad
 
Proofreading and Editing
Proofreading and EditingProofreading and Editing
Proofreading and Editing
Molly Amell
 
Information Retrieval Models Part I
Information Retrieval Models Part IInformation Retrieval Models Part I
Information Retrieval Models Part I
Ingo Frommholz
 
Query formulation process
Query formulation processQuery formulation process
Query formulation process
malathimurugan
 
Ir for it&ites
Ir for it&itesIr for it&ites
Ir for it&ites
Punam Jagtap
 
Information Consolidation
Information ConsolidationInformation Consolidation
Information Consolidation
Kishor Sakariya
 
Tutorial 1 (information retrieval basics)
Tutorial 1 (information retrieval basics)Tutorial 1 (information retrieval basics)
Tutorial 1 (information retrieval basics)
Kira
 
Proof reading, editing and revising by sohail ahmed
Proof reading, editing and revising by sohail ahmedProof reading, editing and revising by sohail ahmed
Proof reading, editing and revising by sohail ahmed
Sohail Ahmed Solangi
 
Editing ppt
Editing pptEditing ppt
Editing ppt
awatkin
 
RTOS- Real Time Operating Systems
RTOS- Real Time Operating Systems RTOS- Real Time Operating Systems
RTOS- Real Time Operating Systems
Bayar shahab
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Matthew Lease
 

Viewers also liked (20)

Information storage and retrieval
Information storage and retrievalInformation storage and retrieval
Information storage and retrieval
 
Copyright issues in a library digital environment
Copyright issues in a library digital environmentCopyright issues in a library digital environment
Copyright issues in a library digital environment
 
Boolean Matching in Logic Synthesis
Boolean Matching in Logic SynthesisBoolean Matching in Logic Synthesis
Boolean Matching in Logic Synthesis
 
Some Information Retrieval Models and Our Experiments for TREC KBA
Some Information Retrieval Models and Our Experiments for TREC KBASome Information Retrieval Models and Our Experiments for TREC KBA
Some Information Retrieval Models and Our Experiments for TREC KBA
 
Ir models
Ir modelsIr models
Ir models
 
Introduction to Information Retrieval
Introduction to Information RetrievalIntroduction to Information Retrieval
Introduction to Information Retrieval
 
SemTech 2011 Semantic Search tutorial
SemTech 2011 Semantic Search tutorialSemTech 2011 Semantic Search tutorial
SemTech 2011 Semantic Search tutorial
 
Models for Information Retrieval and Recommendation
Models for Information Retrieval and RecommendationModels for Information Retrieval and Recommendation
Models for Information Retrieval and Recommendation
 
Planning and Implementing a Digital Library Project
Planning and Implementing a Digital Library ProjectPlanning and Implementing a Digital Library Project
Planning and Implementing a Digital Library Project
 
E-RESOURCES
E-RESOURCESE-RESOURCES
E-RESOURCES
 
Proofreading and Editing
Proofreading and EditingProofreading and Editing
Proofreading and Editing
 
Information Retrieval Models Part I
Information Retrieval Models Part IInformation Retrieval Models Part I
Information Retrieval Models Part I
 
Query formulation process
Query formulation processQuery formulation process
Query formulation process
 
Ir for it&ites
Ir for it&itesIr for it&ites
Ir for it&ites
 
Information Consolidation
Information ConsolidationInformation Consolidation
Information Consolidation
 
Tutorial 1 (information retrieval basics)
Tutorial 1 (information retrieval basics)Tutorial 1 (information retrieval basics)
Tutorial 1 (information retrieval basics)
 
Proof reading, editing and revising by sohail ahmed
Proof reading, editing and revising by sohail ahmedProof reading, editing and revising by sohail ahmed
Proof reading, editing and revising by sohail ahmed
 
Editing ppt
Editing pptEditing ppt
Editing ppt
 
RTOS- Real Time Operating Systems
RTOS- Real Time Operating Systems RTOS- Real Time Operating Systems
RTOS- Real Time Operating Systems
 
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesDeep Learning for Information Retrieval: Models, Progress, & Opportunities
Deep Learning for Information Retrieval: Models, Progress, & Opportunities
 

Similar to Model of information retrieval (3)

Interview_Search_Process (1).pptx
Interview_Search_Process (1).pptxInterview_Search_Process (1).pptx
Interview_Search_Process (1).pptx
AbhinayRaparthi
 
14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for Translation14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for Translation
RIILP
 
An empirical performance evaluation of relational keyword search systems
An empirical performance evaluation of relational keyword search systemsAn empirical performance evaluation of relational keyword search systems
An empirical performance evaluation of relational keyword search systems
Browse Jobs
 
Planning and Preparing to Search
Planning and Preparing to SearchPlanning and Preparing to Search
Planning and Preparing to Search
UniM_Librarian
 
The comparative study of information retrieval models used in search engines
The comparative study of information retrieval models used in search enginesThe comparative study of information retrieval models used in search engines
The comparative study of information retrieval models used in search engines
fawad khan
 
Sub1579
Sub1579Sub1579
Polyrepresentation in a Quantum-inspired Information Retrieval Framework
Polyrepresentation in a Quantum-inspired Information Retrieval FrameworkPolyrepresentation in a Quantum-inspired Information Retrieval Framework
Polyrepresentation in a Quantum-inspired Information Retrieval Framework
Ingo Frommholz
 
Resource Description Framework Approach to Data Publication and Federation
Resource Description Framework Approach to Data Publication and FederationResource Description Framework Approach to Data Publication and Federation
Resource Description Framework Approach to Data Publication and Federation
Pistoia Alliance
 
Chapter 1 Introduction to Information Storage and Retrieval.pdf
Chapter 1 Introduction to Information Storage and Retrieval.pdfChapter 1 Introduction to Information Storage and Retrieval.pdf
Chapter 1 Introduction to Information Storage and Retrieval.pdf
Habtamu100
 
Edad 695 research methodology
Edad 695 research methodologyEdad 695 research methodology
Edad 695 research methodology
Scott Lancaster
 
Recommenders, Topics, and Text
Recommenders, Topics, and TextRecommenders, Topics, and Text
Recommenders, Topics, and Text
NBER
 
Reference: NYLA Library Assistants Training
Reference: NYLA Library Assistants TrainingReference: NYLA Library Assistants Training
Reference: NYLA Library Assistants Training
Sarah Maximiek
 
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information Retrieval
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information RetrievalKeystone Summer School 2015: Mauro Dragoni, Ontologies For Information Retrieval
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information Retrieval
Mauro Dragoni
 
[IJET-V2I3P19] Authors: Priyanka Sharma
[IJET-V2I3P19] Authors: Priyanka Sharma[IJET-V2I3P19] Authors: Priyanka Sharma
[IJET-V2I3P19] Authors: Priyanka Sharma
IJET - International Journal of Engineering and Techniques
 
Nbe rtopicsandrecomvlecture1
Nbe rtopicsandrecomvlecture1Nbe rtopicsandrecomvlecture1
Nbe rtopicsandrecomvlecture1
NBER
 
Modern information Retrieval-Relevance Feedback
Modern information Retrieval-Relevance FeedbackModern information Retrieval-Relevance Feedback
Modern information Retrieval-Relevance Feedback
HasanulFahmi2
 
Information retrival system and PageRank algorithm
Information retrival system and PageRank algorithmInformation retrival system and PageRank algorithm
Information retrival system and PageRank algorithm
Rupali Bhatnagar
 
TheoreticalFramework
TheoreticalFrameworkTheoreticalFramework
TheoreticalFramework
Ogunleye Samuel
 
Hinari basic course_module_2_workbook_2014_07
Hinari basic course_module_2_workbook_2014_07Hinari basic course_module_2_workbook_2014_07
Hinari basic course_module_2_workbook_2014_07
Aslam Mehdi
 
There are 8 discussions needed in 3 days (72 hours). I added the lis.docx
There are 8 discussions needed in 3 days (72 hours). I added the lis.docxThere are 8 discussions needed in 3 days (72 hours). I added the lis.docx
There are 8 discussions needed in 3 days (72 hours). I added the lis.docx
susannr
 

Similar to Model of information retrieval (3) (20)

Interview_Search_Process (1).pptx
Interview_Search_Process (1).pptxInterview_Search_Process (1).pptx
Interview_Search_Process (1).pptx
 
14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for Translation14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for Translation
 
An empirical performance evaluation of relational keyword search systems
An empirical performance evaluation of relational keyword search systemsAn empirical performance evaluation of relational keyword search systems
An empirical performance evaluation of relational keyword search systems
 
Planning and Preparing to Search
Planning and Preparing to SearchPlanning and Preparing to Search
Planning and Preparing to Search
 
The comparative study of information retrieval models used in search engines
The comparative study of information retrieval models used in search enginesThe comparative study of information retrieval models used in search engines
The comparative study of information retrieval models used in search engines
 
Sub1579
Sub1579Sub1579
Sub1579
 
Polyrepresentation in a Quantum-inspired Information Retrieval Framework
Polyrepresentation in a Quantum-inspired Information Retrieval FrameworkPolyrepresentation in a Quantum-inspired Information Retrieval Framework
Polyrepresentation in a Quantum-inspired Information Retrieval Framework
 
Resource Description Framework Approach to Data Publication and Federation
Resource Description Framework Approach to Data Publication and FederationResource Description Framework Approach to Data Publication and Federation
Resource Description Framework Approach to Data Publication and Federation
 
Chapter 1 Introduction to Information Storage and Retrieval.pdf
Chapter 1 Introduction to Information Storage and Retrieval.pdfChapter 1 Introduction to Information Storage and Retrieval.pdf
Chapter 1 Introduction to Information Storage and Retrieval.pdf
 
Edad 695 research methodology
Edad 695 research methodologyEdad 695 research methodology
Edad 695 research methodology
 
Recommenders, Topics, and Text
Recommenders, Topics, and TextRecommenders, Topics, and Text
Recommenders, Topics, and Text
 
Reference: NYLA Library Assistants Training
Reference: NYLA Library Assistants TrainingReference: NYLA Library Assistants Training
Reference: NYLA Library Assistants Training
 
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information Retrieval
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information RetrievalKeystone Summer School 2015: Mauro Dragoni, Ontologies For Information Retrieval
Keystone Summer School 2015: Mauro Dragoni, Ontologies For Information Retrieval
 
[IJET-V2I3P19] Authors: Priyanka Sharma
[IJET-V2I3P19] Authors: Priyanka Sharma[IJET-V2I3P19] Authors: Priyanka Sharma
[IJET-V2I3P19] Authors: Priyanka Sharma
 
Nbe rtopicsandrecomvlecture1
Nbe rtopicsandrecomvlecture1Nbe rtopicsandrecomvlecture1
Nbe rtopicsandrecomvlecture1
 
Modern information Retrieval-Relevance Feedback
Modern information Retrieval-Relevance FeedbackModern information Retrieval-Relevance Feedback
Modern information Retrieval-Relevance Feedback
 
Information retrival system and PageRank algorithm
Information retrival system and PageRank algorithmInformation retrival system and PageRank algorithm
Information retrival system and PageRank algorithm
 
TheoreticalFramework
TheoreticalFrameworkTheoreticalFramework
TheoreticalFramework
 
Hinari basic course_module_2_workbook_2014_07
Hinari basic course_module_2_workbook_2014_07Hinari basic course_module_2_workbook_2014_07
Hinari basic course_module_2_workbook_2014_07
 
There are 8 discussions needed in 3 days (72 hours). I added the lis.docx
There are 8 discussions needed in 3 days (72 hours). I added the lis.docxThere are 8 discussions needed in 3 days (72 hours). I added the lis.docx
There are 8 discussions needed in 3 days (72 hours). I added the lis.docx
 

Recently uploaded

TrustArc Webinar - Your Guide for Smooth Cross-Border Data Transfers and Glob...
TrustArc Webinar - Your Guide for Smooth Cross-Border Data Transfers and Glob...TrustArc Webinar - Your Guide for Smooth Cross-Border Data Transfers and Glob...
TrustArc Webinar - Your Guide for Smooth Cross-Border Data Transfers and Glob...
TrustArc
 
Introducing BoxLang : A new JVM language for productivity and modularity!
Introducing BoxLang : A new JVM language for productivity and modularity!Introducing BoxLang : A new JVM language for productivity and modularity!
Introducing BoxLang : A new JVM language for productivity and modularity!
Ortus Solutions, Corp
 
Mutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented ChatbotsMutation Testing for Task-Oriented Chatbots
Mutation Testing for Task-Oriented Chatbots
Pablo Gómez Abajo
 
So You've Lost Quorum: Lessons From Accidental Downtime
So You've Lost Quorum: Lessons From Accidental DowntimeSo You've Lost Quorum: Lessons From Accidental Downtime
So You've Lost Quorum: Lessons From Accidental Downtime
ScyllaDB
 
ScyllaDB Tablets: Rethinking Replication
ScyllaDB Tablets: Rethinking ReplicationScyllaDB Tablets: Rethinking Replication
ScyllaDB Tablets: Rethinking Replication
ScyllaDB
 
ScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDB
ScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDBScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDB
ScyllaDB Leaps Forward with Dor Laor, CEO of ScyllaDB
ScyllaDB
 
Call Girls Chandigarh🔥7023059433🔥Agency Profile Escorts in Chandigarh Availab...
Call Girls Chandigarh🔥7023059433🔥Agency Profile Escorts in Chandigarh Availab...Call Girls Chandigarh🔥7023059433🔥Agency Profile Escorts in Chandigarh Availab...
Call Girls Chandigarh🔥7023059433🔥Agency Profile Escorts in Chandigarh Availab...
manji sharman06
 
CNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My Identity
CNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My IdentityCNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My Identity
CNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My Identity
Cynthia Thomas
 
Day 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio FundamentalsDay 2 - Intro to UiPath Studio Fundamentals
Day 2 - Intro to UiPath Studio Fundamentals
UiPathCommunity
 
APJC Introduction to ThousandEyes Webinar
APJC Introduction to ThousandEyes WebinarAPJC Introduction to ThousandEyes Webinar
APJC Introduction to ThousandEyes Webinar
ThousandEyes
 
Introduction to ThousandEyes AMER Webinar
Introduction  to ThousandEyes AMER WebinarIntroduction  to ThousandEyes AMER Webinar
Introduction to ThousandEyes AMER Webinar
ThousandEyes
 
Model of information retrieval (3)

  • 1. BY N. SUMANJALI DPT OF LIS PONDICHERRY UNIVERSITY
  • 2. INFORMATION RETRIEVAL  Information retrieval is the activity of obtaining information resources relevant to an information need from a collection of information resources.  Searches can be based on metadata or on full-text (or other content-based) indexing.  Goal: Find the documents most relevant to a certain Query  Dealing with notions of:  Collection of documents  Query (User’s information need)  Notion of Relevancy
  • 3. MODEL  A model is a construct designed to help us understand a complex system  A particular way of “looking at things”  Models inevitably make simplifying assumptions  What are the limitations of the model?  Different types of models:  Conceptual models  Physical analog models  Mathematical models
  • 4. Retrieval Models A retrieval model specifies the details of:  Document representation  Query representation  Retrieval function Determines a notion of relevance. Notion of relevance can be binary or continuous (i.e. ranked retrieval).
  • 5. CLASSES OF RM Boolean models (set theoretic)  Extended Boolean Vector space models (statistical/algebraic)  Generalized VS  Latent Semantic Indexing Probabilistic models
  • 6. MODELS OF IR  Boolean model  Based on the notion of sets  Documents are retrieved only if they satisfy the Boolean conditions specified in the query  Does not impose a ranking on retrieved documents  Exact match  Vector space model  Based on geometry, the notion of vectors in high-dimensional space  Documents are ranked based on their similarity to the query (ranked retrieval)  Best/partial match
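The contrast between exact match and ranked retrieval can be made concrete with a small sketch of vector space ranking. It assumes raw term counts as vector weights and cosine similarity as the similarity measure; the slides name neither, and the toy documents and the `rank` helper are hypothetical.

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical toy collection (not from the slides).
docs = {
    "d1": "cheap hotel in rio brazil",
    "d2": "hilton hotel in hilo hawaii",
    "d3": "brazil football results",
}

def rank(query: str) -> list[tuple[str, float]]:
    """Score every document against the query and sort by descending similarity."""
    q = Counter(query.split())
    scored = [(doc_id, cosine(q, Counter(text.split()))) for doc_id, text in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

ranking = rank("hotel brazil")
for doc_id, score in ranking:
    print(doc_id, round(score, 3))
```

Every document that shares at least one term with the query receives a non-zero score, which is the "best/partial match" behaviour that the Boolean model lacks.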
  • 7.  Language models  Based on the notion of probabilities and processes for generating text  Documents are ranked based on the probability that they generated the query  Best/partial match
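Ranking by "the probability that they generated the query" can be sketched as a unigram query-likelihood score. The Jelinek-Mercer smoothing, the weight `lam`, and the toy documents are assumptions added for illustration; the slide does not specify how the probabilities are estimated.

```python
from collections import Counter

# Hypothetical toy collection (not from the slides), already tokenized.
docs = {
    "d1": "good party good music".split(),
    "d2": "democratic party election".split(),
}

# Collection-wide counts, used only for smoothing (an assumption of this sketch).
collection = Counter(t for tokens in docs.values() for t in tokens)
collection_len = sum(collection.values())

def query_likelihood(query, tokens, lam=0.5):
    """P(query | document model): product of smoothed unigram probabilities."""
    tf = Counter(tokens)
    score = 1.0
    for term in query:
        p_doc = tf[term] / len(tokens)                 # estimate from the document
        p_coll = collection[term] / collection_len     # estimate from the collection
        score *= lam * p_doc + (1 - lam) * p_coll
    return score

query = "good party".split()
scores = {doc_id: query_likelihood(query, tokens) for doc_id, tokens in docs.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

Without smoothing, a document missing any single query term would score zero, which would turn the ranking back into something close to an exact match.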
  • 8. BOOLEAN MODEL  Based on the logic invented by George Boole (1815-1864)  He devised a system of symbolic logic in which he used three operators (+, ×, -) to combine statements in symbolic form.  These operators of Boolean logic, which John Venn later illustrated with diagrams, are the logical sum (+), logical product (×), and logical difference (-).  IR systems allow users to express their queries using these operators.
  • 9. BOOLEAN MODEL  Each index term is either present or absent  Documents are either Relevant or Not Relevant (no ranking)  A document is represented as a set of keywords.  Queries are Boolean expressions of keywords, connected by AND, OR, and NOT, including the use of brackets to indicate scope.  Example: [[[Rio & Brazil] | [Hilo & Hawaii]] & hotel & !Hilton]  Output: Document is relevant or not. No partial matches or ranking.
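A minimal sketch of how the bracketed query on this slide could be evaluated when each document is a set of keywords. The query is read with balanced brackets, and the four toy documents and the `matches` helper are hypothetical.

```python
# Hypothetical toy collection: each document is represented as a set of keywords.
docs = {
    "d1": {"rio", "brazil", "hotel", "beach"},
    "d2": {"hilo", "hawaii", "hotel", "hilton"},
    "d3": {"hilo", "hawaii", "hotel"},
    "d4": {"rio", "brazil", "museum"},
}

def matches(keywords: set[str]) -> bool:
    """Evaluate ((Rio AND Brazil) OR (Hilo AND Hawaii)) AND hotel AND NOT Hilton."""
    place = ({"rio", "brazil"} <= keywords) or ({"hilo", "hawaii"} <= keywords)
    return place and "hotel" in keywords and "hilton" not in keywords

# The output is an unordered set of matching documents; sorting is only for display.
result = sorted(doc_id for doc_id, kw in docs.items() if matches(kw))
print(result)
```

Document d2 is rejected outright because it contains "hilton", and d4 because it lacks "hotel"; neither is treated as a partial match.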
  • 10. BOOLEAN RETRIEVAL MODEL  Popular retrieval model because:  Easy to understand for simple queries.  Clean formalism.  Boolean models can be extended to include ranking.  Reasonably efficient implementations possible for normal queries.
  • 11. BOOLEAN MODEL  Weights assigned to terms are either “0” or “1”  “0” represents “absence”: term isn’t in the document  “1” represents “presence”: term is in the document  Build queries by combining terms with Boolean operators  AND, OR, NOT  The system returns all documents that satisfy the query
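The 0/1 weights can be pictured as a term-document incidence table, with the Boolean operators applied position by position. This is a sketch only; the three toy documents and the helper names `bool_and`, `bool_or`, and `bool_not` are hypothetical.

```python
# Hypothetical toy collection (not from the slides).
docs = ["good party tonight", "democratic party convention", "good food"]

# incidence[term][i] is 1 if the term occurs in docs[i], else 0.
vocab = sorted({t for d in docs for t in d.split()})
incidence = {t: [1 if t in d.split() else 0 for d in docs] for t in vocab}

def bool_and(a, b):
    return [x & y for x, y in zip(a, b)]

def bool_or(a, b):
    return [x | y for x, y in zip(a, b)]

def bool_not(a):
    return [1 - x for x in a]

# Query: good AND party AND NOT democratic
hits = bool_and(
    bool_and(incidence["good"], incidence["party"]),
    bool_not(incidence["democratic"]),
)
retrieved = [docs[i] for i, bit in enumerate(hits) if bit]
print(hits, retrieved)
```

The system returns every document whose bit is 1 and nothing else, with no order among the returned documents.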
  • 13. Why Boolean Retrieval Works  Boolean operators approximate natural language  Find documents about a good party that is not over  AND can discover relationships between concepts  good party  OR can discover alternate terminology  excellent party, wild party, etc.  NOT can discover alternate meanings  Democratic party
  • 14. The Perfect Query Paradox  Every information need has a perfect set of documents  If not, there would be no sense doing retrieval  Every document set has a perfect query  AND every word in a document to get a query for it  Repeat for each document in the set  OR every document query to get the set query  But can users realistically be expected to formulate this perfect query?  Boolean query formulation is hard!
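The construction on this slide can be sketched directly: one AND-clause per target document, with the clauses OR-ed together. The sketch also surfaces a caveat that is an observation of this example rather than a claim from the slides: a document containing extra words still satisfies a pure AND-clause, so an exact match additionally needs NOT on every other vocabulary word. The toy documents and function names are hypothetical.

```python
# Hypothetical toy collection: each document is a set of words.
docs = {
    "d1": {"good", "party"},
    "d2": {"good", "party", "music"},
    "d3": {"democratic", "party"},
}

def perfect_query(target_ids):
    """One clause per target document (all of its words AND-ed); clauses are OR-ed."""
    return [frozenset(docs[d]) for d in target_ids]

def evaluate_and(query):
    """A document matches if it contains every word of at least one clause."""
    return {d for d, words in docs.items() if any(clause <= words for clause in query)}

def evaluate_exact(query):
    """As above, but every word outside the clause is also NOT-ed out."""
    return {d for d, words in docs.items() if any(clause == words for clause in query)}

query = perfect_query(["d1", "d3"])
print(evaluate_and(query))    # also retrieves d2, a superset of d1
print(evaluate_exact(query))  # retrieves only the two target documents
```

Even in this three-document collection the query has to enumerate every word of every wanted document, which illustrates why users cannot realistically be expected to write it.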
  • 15. Why Boolean Retrieval Fails • Natural language is way more complex • AND “discovers” nonexistent relationships – Terms in different sentences, paragraphs, … • Guessing terminology for OR is hard – good, nice, excellent, outstanding, awesome, … • Guessing terms to exclude is even harder! – Democratic party, party to a lawsuit, …
  • 16. BOOLEAN MODEL  Strengths  Precise, if you know the right strategies  Precise, if you have an idea of what you’re looking for  Efficient for the computer  Simple  Weaknesses  Users must learn Boolean logic  Boolean logic insufficient to capture the richness of language  No control over size of result set: either too many documents or none  When do you stop reading? All documents in the result set are considered “equally good”  What about partial matches? Documents that “don’t quite match” the query may be useful also  No notion of ranking (exact matching only)  All index terms have equal weight
  • 17. PROBLEMS  Very rigid: AND means all; OR means any.  Difficult to express complex user requests.  Difficult to control the number of documents retrieved.  All matched documents will be returned.  Difficult to rank output.  All matched documents logically satisfy the query.  Difficult to perform relevance feedback.  If a document is identified by the user as relevant or irrelevant, how should the query be modified?
  • 18. ADVANTAGES & DISADVANTAGES  Advantages  Results are predictable, relatively easy to explain  Many different features can be incorporated  Efficient processing since many documents can be eliminated from search  Disadvantages  Effectiveness depends entirely on user  Simple queries usually don’t work well  Complex queries are difficult.
  • 19. LIMITATIONS  The first limitation relates to the formulation of search statements.  It has been noted that users are not able to formulate an exact search statement by combining the AND, OR and NOT operators, especially when several query terms are involved.  In such cases the search statement becomes either too narrow or too broad.  The second limitation relates to the number of retrieved items.  It has been noted that users cannot predict a priori exactly how many items will be retrieved for a given query.  If the search statement is broad, the number of retrieved items may run to several hundred, making it quite difficult to find the exact information required.  The third limitation is that the model identifies an item as relevant only by checking whether a given query term is present in a given record in the database.