Chapter 2 Modeling
Dept. of Computer Science and Information Engineering, Class 4B, 86075800
陳建勳
Introduction
• Traditional information retrieval systems
usually adopt index terms to index and retrieve
documents.
• An index term is a keyword (or group of related
words) which has some meaning of its own
(usually a noun).
The advantages of using index
terms
• Simple.
• The semantics of the documents and of the
user information need can be naturally
expressed through sets of index terms.
• Ranking algorithms are at the core of information
retrieval systems (predicting which documents are
relevant and which are not).
A taxonomy of information retrieval
models
User task
  Retrieval: ad hoc, filtering
    Classic models: Boolean, vector, probabilistic
      Set theoretic: fuzzy, extended Boolean
      Algebraic: generalized vector, latent semantic indexing, neural networks
      Probabilistic: inference network, belief network
    Structured models: non-overlapping lists, proximal nodes
  Browsing: flat, structure guided, hypertext
             Index Terms      Full Text        Full Text + Structure
Retrieval    Classic          Classic          Structured
             Set Theoretic    Set Theoretic
             Algebraic        Algebraic
             Probabilistic    Probabilistic
Browsing     Flat             Flat             Structure Guided
                              Hypertext        Hypertext
Figure 2.2 Retrieval models most frequently associated with distinct
combinations of a document logical view and a user task.
Retrieval: Ad hoc and Filtering
Ad hoc: The documents in the collection
remain relatively static while new queries
are submitted to the system.
Filtering: The queries remain relatively
static while new documents come into the
system.
Filtering
Typically, the filtering task simply
indicates to the user the documents
which might be of interest to them.
Routing: Rank the filtered documents
and show this ranking to the user.
User profiles can be constructed in two ways.
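The difference between filtering and routing can be sketched as follows. This is only an illustration: it assumes a user profile is a plain set of keywords and that a document's score is the number of profile keywords it contains. The function names and the scoring rule are hypothetical and do not come from the slides.

```python
def score(profile, document):
    """Count how many profile keywords occur in the document (assumed scoring rule)."""
    words = set(document.lower().split())
    return len(profile & words)


def filter_documents(profile, incoming, threshold=1):
    """Filtering: only indicate which incoming documents might be of interest."""
    return [doc for doc in incoming if score(profile, doc) >= threshold]


def route_documents(profile, incoming, threshold=1):
    """Routing: rank the filtered documents before showing them to the user."""
    selected = filter_documents(profile, incoming, threshold)
    return sorted(selected, key=lambda doc: score(profile, doc), reverse=True)


# The profile (the "query") stays fixed while new documents arrive.
profile = {"retrieval", "ranking"}
incoming = [
    "stock market news",
    "a survey of retrieval models",
    "ranking functions for retrieval",
]
```

Here `filter_documents` returns the two matching documents in arrival order, while `route_documents` places the document matching both keywords first.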
A formal characterization of IR models
D: A set composed of logical views (or
representations) of the documents in the
collection.
Q: A set composed of logical views (or
representations) of the user information
needs (queries).
F: A framework for modeling document
representations, queries, and their relationships.
R(qi, dj): A ranking function which defines an
ordering among the documents with regard to the
query qi.
Classic information retrieval
model
Basic concepts: Each document is
described by a set of representative
keywords called index terms.
Numerical weights are assigned to the index
terms, since distinct index terms have
varying relevance for describing a document.
Define
ki: A generic index term
K: The set of all index terms {k1,…,kt}
wi,j: A weight associated with index term
ki of a document dj
gi: A function that returns the weight associated
with ki in any t-dimensional vector (gi(dj) = wi,j)
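As a minimal sketch of these definitions, a document can be held as a t-dimensional list of weights, with gi simply reading position i. The term names and weight values below are made up for illustration.

```python
# K = {k1, ..., kt}: the set of all index terms (illustrative names, t = 3).
K = ["information", "retrieval", "model"]

# One t-dimensional vector per document; position i holds w_{i,j},
# the weight of index term k_i in document d_j (illustrative values).
documents = {
    "d1": [0.5, 0.8, 0.0],
    "d2": [0.0, 0.3, 0.9],
}


def g(i, vector):
    """Return the weight associated with index term k_i (1-based) in a t-dimensional vector."""
    return vector[i - 1]
```

With this layout, `g(2, documents["d1"])` plays the role of g2(d1) = w2,1.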
Boolean model
Based on a binary decision criterion without any
notion of a grading scale.
Boolean expressions have precise semantics, but it is
not simple to translate an information need into
a Boolean expression.
A query can be represented as a disjunction of
conjunctive vectors (disjunctive normal form,
DNF).
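The DNF representation can be illustrated with a small sketch. The query q = ka AND (kb OR NOT kc), the term names, and the function names are assumptions chosen for the example. Each conjunctive component is a binary weight vector over (ka, kb, kc), and a document is retrieved only if its own binary vector equals one of them.

```python
from itertools import product

TERMS = ("ka", "kb", "kc")


def query(ka, kb, kc):
    """Example query q = ka AND (kb OR NOT kc)."""
    return ka and (kb or not kc)


# DNF of the query: every binary vector over (ka, kb, kc) that satisfies it.
q_dnf = {bits for bits in product((1, 0), repeat=3) if query(*bits)}


def sim(doc_terms):
    """Binary decision: 1 if the document matches a conjunctive component, else 0."""
    vector = tuple(int(term in doc_terms) for term in TERMS)
    return 1 if vector in q_dnf else 0
```

For this query the DNF has three components, (1,1,1), (1,1,0) and (1,0,0), and `sim` returns only 0 or 1, which reflects the absence of any grading scale.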
Vector model
Assigns non-binary weights to index
terms in queries and in documents.
Computes the degree of similarity between
each document and the query.
More precise than the Boolean model.
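The slide does not say which similarity measure is used. A common choice, assumed here, is the cosine of the angle between the document vector and the query vector. The function name is illustrative.

```python
import math


def cosine_similarity(d, q):
    """Cosine of the angle between two equal-length weight vectors."""
    dot = sum(wd * wq for wd, wq in zip(d, q))
    norm_d = math.sqrt(sum(w * w for w in d))
    norm_q = math.sqrt(sum(w * w for w in q))
    if norm_d == 0 or norm_q == 0:
        return 0.0  # an all-zero vector shares no weighted term with anything
    return dot / (norm_d * norm_q)
```

Because the weights are non-binary, the result varies continuously between 0 and 1 for non-negative weights, so documents that match the query only partially can still be ranked.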
Idea
We think of the documents as a collection C
of objects and think of the user query as a
specification of a set A of objects. In this
scenario, the IR problem can be reduced to
the problem of determining which documents
are in the set A and which ones are not (i.e.,
the IR problem can be viewed as a
clustering problem).
Intra-cluster: One needs to determine
which features better
describe the objects in the set A.
Inter-cluster: One needs to determine
which features better
distinguish the objects in the set A from
the remaining objects in the collection C.
tf: Intra-cluster similarity is quantified by
measuring the raw frequency of a term ki
inside a document dj. This term frequency is
usually referred to as the tf factor and
provides one measure of how well that term
describes the document contents.
idf: Inter-cluster dissimilarity is quantified by
measuring the inverse of the frequency of a
term ki among the documents in the
collection. This factor is usually referred to
as the inverse document frequency (idf).
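A minimal sketch of combining the two factors into a term weight follows. The slides name only the raw frequency and its inverse; the normalization of tf by the most frequent term in the document and the logarithm in idf are common conventions assumed here, and the function and variable names are illustrative.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # n_i: number of documents in which term k_i appears.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        freq = Counter(doc)                       # raw frequency of each term in d_j
        max_freq = max(freq.values(), default=1)  # assumed tf normalization
        weights.append({
            term: (count / max_freq) * math.log(n_docs / doc_freq[term])
            for term, count in freq.items()
        })
    return weights


docs = [
    ["retrieval", "model", "retrieval"],
    ["boolean", "model"],
    ["vector", "model"],
]
w = tf_idf(docs)
```

In this toy collection "model" occurs in every document, so its idf is log(3/3) = 0 and it receives weight 0 everywhere: it describes the documents but does not distinguish among them.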
The vector model is simple and fast. It is a
popular retrieval model.
Disadvantage: Index terms are
assumed to be mutually independent, so the
model does not account for index term
dependencies.
Probabilistic model
• We can think of the querying process
as a process of specifying the properties
of an ideal answer set. (The problem is
that we do not know exactly what these
properties are.)
Structured text retrieval model
• Retrieval models which combine information on
text content with information on the document
structure are called structured text retrieval
models.
• Match point: refers to the position in the text
of a sequence of words which matches the user
query.
• Region: refers to a contiguous portion of the
text.
• Node: refers to a structural component of the
document such as a chapter, a section, or a
subsection.
Model based on Non-overlapping
lists
Divide the whole text of each document
into non-overlapping text regions which
are collected in a list.
Text regions in the same list do not
overlap, but text regions from
distinct lists might overlap.
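One way to picture this model is to store each region as a (start, end) character span. The representation, the sample spans, and the function names below are assumptions for illustration only.

```python
def is_non_overlapping(regions):
    """Check that no two (start, end) regions in one list overlap; end is exclusive."""
    ordered = sorted(regions)
    return all(
        prev_end <= start
        for (_, prev_end), (start, _) in zip(ordered, ordered[1:])
    )


def overlaps(a, b):
    """True if two (start, end) regions share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


# Two hypothetical lists defined over the same 100-character text.
chapters = [(0, 60), (60, 100)]
sections = [(0, 25), (25, 60), (60, 80), (80, 100)]
```

Each list is non-overlapping on its own, yet the first chapter overlaps the second section, which is the situation described for regions from distinct lists.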
Model based on Proximal
nodes
A model which allows the definition of
independent hierarchical indexing
structures over the same document text.
Each of these index structures is a strict
hierarchy composed of chapters,
sections, paragraphs, pages, and lines,
which are called nodes.
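A rough sketch of one such index structure follows, under the assumption that "strict hierarchy" means every node lies inside its parent and sibling nodes do not overlap. The `Node` class, its fields, and both helper functions are hypothetical, not taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    kind: str    # e.g. "chapter", "section", "paragraph"
    start: int   # position of the node's first character
    end: int     # position just past the node's last character
    children: list = field(default_factory=list)


def is_strict_hierarchy(node):
    """Every child lies inside its parent and siblings do not overlap."""
    previous_end = node.start
    for child in sorted(node.children, key=lambda c: c.start):
        if child.start < previous_end or child.end > node.end:
            return False
        if not is_strict_hierarchy(child):
            return False
        previous_end = child.end
    return True


def nodes_containing(node, position):
    """Kinds of the nodes, outermost first, whose span covers a match point."""
    if not (node.start <= position < node.end):
        return []
    path = [node.kind]
    for child in node.children:
        path += nodes_containing(child, position)
    return path


chapter = Node("chapter", 0, 100, [
    Node("section", 0, 40, [Node("paragraph", 0, 20), Node("paragraph", 20, 40)]),
    Node("section", 40, 100),
])
```

A second, independent hierarchy (for example pages and lines) could be built over the same text in the same way; `nodes_containing` then maps a match point to the structural components that enclose it.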
Models for browsing
Flat browsing
Structure guided browsing
The hypertext model
Flat browsing
The documents might be represented
as dots in a plane or as elements in a list.
Relevance feedback
Disadvantage: In a given page or
screen there may not be any indication
of the context the user is in.
Structure guided browsing
Documents are organized in a directory
structure, which groups documents covering
related topics.
The same idea can be applied to a
single document.
A history map can be used.
The hypertext model
Written text is usually conceived to be
read sequentially.
The reader should not expect to fully
understand the message conveyed by
the writer by randomly reading pieces
of text here and there.

Information Retrieval Models
