Search engines
7/2/2019 Compiled by: Kamal Acharya 1
Introduction
• Most libraries have a relatively small collection of documents and a
catalogue to search documents.
• The web, on the other hand, is a very large collection of documents.
• Search engines allow a user to search the web for information.
• Google dominates the search engine market (about 75%).
Differences between web search and information retrieval
• Web search differs from conventional information retrieval (IR)
over a document collection because of the following factors:
– Bulk:
• The web is much larger than any document collection used in IR.
– Diversity:
• Web pages may contain text, tables, images, video and audio
instead of only text.
– Growth:
• The web is growing exponentially.
Contd..
– Dynamic:
• The web changes significantly over time, whereas a conventional text collection does not.
– Demanding Users:
• Users demand immediate results.
– Quality of document:
• Text documents are usually of high quality, but web documents may not be.
– Hyperlinks:
• Hyperlinks are a very important component of web documents.
– Queries:
• Web queries are short and ambiguous.
Search engine
• A search engine is a program that searches documents for
specified keywords and returns a list of the documents in
which the keywords are found.
• A search engine consists of the following main components:
– Crawler(spider)
– Indexer
– Search engine user interface
Contd…
• A typical search engine architecture is shown in the figure below.
Contd..
• How does a search engine work?
A search engine operates in the following order:
1. Web crawling
2. Indexing
3. Searching
Contd..
• Web Crawling:
– A search engine has a huge database of web pages. This database
is built and updated automatically by the web crawler.
– The web crawler performs web crawling as follows:
• The crawler begins with one or more URLs that constitute a
URL set.
• It picks a URL from this URL set, and then fetches the web
page at that URL.
• The fetched page is then parsed to extract both the text and the
links from the page.
• The extracted links (URLs) are then added to a URL set.
• The extracted text is fed to a text indexer.
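The crawling loop described above can be sketched in Python as follows. This is a minimal illustration, not a production crawler: the names `crawl`, `LinkAndTextParser`, `fetch`, `index_text` and the in-memory `FAKE_WEB` dictionary are assumptions made for this example, and the "web" is simulated so that the sketch runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkAndTextParser(HTMLParser):
    """Collects link targets and visible text from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        if data.strip():
            self.text_parts.append(data.strip())


def crawl(seed_urls, fetch, index_text, max_pages=100):
    """Crawl from the seed URLs; fetch(url) returns HTML or None."""
    url_set = deque(seed_urls)      # the URL set of pages still to visit
    seen = set(seed_urls)           # avoids adding the same URL twice
    visited = []
    while url_set and len(visited) < max_pages:
        url = url_set.popleft()     # pick a URL from the URL set
        html = fetch(url)           # fetch the page at that URL
        if html is None:
            continue
        visited.append(url)
        parser = LinkAndTextParser()
        parser.feed(html)           # parse out the text and the links
        parser.close()
        index_text(url, " ".join(parser.text_parts))  # text goes to the indexer
        for href in parser.links:   # extracted links go back into the URL set
            absolute = urljoin(url, href)
            if absolute not in seen:
                seen.add(absolute)
                url_set.append(absolute)
    return visited


# Hypothetical in-memory "web" standing in for real HTTP fetches.
FAKE_WEB = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a> home page',
    "http://example.test/a": '<a href="/">home</a> page about crawlers',
    "http://example.test/b": "page about indexing",
}

indexed = {}
order = crawl(["http://example.test/"], FAKE_WEB.get, indexed.__setitem__)
print(order)
```

A real crawler would replace `FAKE_WEB.get` with an HTTP client and would also need the politeness and robustness features discussed later in these slides.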
Contd..
• Indexing:
– The indexer module of the search engine is responsible for
indexing the extracted text supplied by the web crawler.
– The most commonly used index structure is the inverted index,
which maps each word to the documents that contain it.
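As a rough illustration of inverted indexing, the sketch below maps each word to the set of documents containing it. The function names `tokenize` and `build_inverted_index`, the sample documents, and the simple lowercase tokenization are assumptions for this example; real indexers usually also store positions and frequencies.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


docs = {
    1: "web search engines crawl the web",
    2: "an indexer builds the inverted index",
    3: "search the index",
}
index = build_inverted_index(docs)
print(sorted(index["search"]))  # [1, 3]
print(sorted(index["index"]))   # [2, 3]
```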
Contd..
• Searching:
– When a user submits a query to the search engine, the user is not
searching the entire web. Instead, the user is searching only the
database that has been compiled by the search engine.
– The user’s query is parsed into words by the query parser.
– The parsed words are matched against the words in the inverted
list of indexed documents.
– The list of matching documents is returned to the user in
ranked order.
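The search steps above can be sketched as follows. The ranking used here (summing how often each query word occurs in a document) is a deliberately simple stand-in chosen for this example, not the ranking method of any particular search engine; `parse_query`, `build_index` and `search` are hypothetical names.

```python
import re
from collections import Counter, defaultdict


def parse_query(text):
    """Query parser: split text into lowercase words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Inverted index: word -> Counter of {doc_id: occurrences}."""
    index = defaultdict(Counter)
    for doc_id, text in documents.items():
        for term in parse_query(text):
            index[term][doc_id] += 1
    return index


def search(query, index):
    """Return matching doc ids, highest total word count first."""
    scores = Counter()
    for term in parse_query(query):
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    # Sort by descending score, then by doc id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


docs = {
    "d1": "web search engines search the web",
    "d2": "the crawler fetches web pages",
    "d3": "cooking recipes",
}
results = search("Web Search", build_index(docs))
print(results)  # ['d1', 'd2']
```

Note that the query is answered entirely from the prebuilt index; the documents themselves are not scanned at query time.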
Characteristics of search engines
• Features a search engine must provide:
– Robustness:
• The search engine must be distributed over a large number of
machines so that the failure of a single machine does not bring
down the whole search engine.
– Politeness:
• Web servers have policies regulating the rate at which a search
engine can visit them. These politeness policies must be
respected.
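One common way for servers to publish such policies is a robots.txt file. The sketch below uses Python's standard `urllib.robotparser` module to check it; the robots.txt content, the user agent name `ExampleBot` and the URLs are made up for illustration, and the rules are parsed from a string so that the example runs offline.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real crawler would download it
# from the site (for example with RobotFileParser.set_url and read).
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 2
Disallow: /private/
"""

rules = RobotFileParser()
rules.parse(ROBOTS_TXT.splitlines())


def may_fetch(url, user_agent="ExampleBot"):
    """True if the site's rules allow this user agent to fetch the URL."""
    return rules.can_fetch(user_agent, url)


# Seconds the crawler should wait between requests to this site.
delay = rules.crawl_delay("ExampleBot")

print(may_fetch("https://example.com/public/page"))   # True
print(may_fetch("https://example.com/private/page"))  # False
print(delay)                                          # 2
```

A polite crawler would call `may_fetch` before every request and sleep for at least `delay` seconds between requests to the same host.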
Contd..
• Features a search engine should provide:
– Distributed:
• The search engine should be able to execute in a
distributed fashion across multiple machines.
– Scalable:
• The search engine architecture should permit scaling up
the search rate by adding extra machines.
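One possible way to spread crawling work over several machines is to assign each URL to a machine by hashing its host name, so that all pages of one site are handled by the same machine. This is only a sketch under that assumption; `machine_for` is a hypothetical helper, and plain modulo hashing reshuffles most hosts whenever the number of machines changes.

```python
import hashlib
from urllib.parse import urlsplit


def machine_for(url, num_machines):
    """Pick a machine index (0 .. num_machines - 1) from the URL's host."""
    host = urlsplit(url).hostname or ""
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_machines


urls = [
    "http://example.com/a",
    "http://example.com/b",
    "http://example.org/index.html",
]
for url in urls:
    print(url, "-> machine", machine_for(url, 4))
```

Keeping a whole host on one machine also makes it easier to enforce per-site politeness limits in a single place.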
Contd..
• Performance and efficiency:
– The search system should make efficient use of various
system resources including processor, storage and network.
• Quality:
– Given that a significant fraction of all web pages are of poor utility
for serving user query needs, the search engine should be biased
towards fetching “useful” pages first.
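A bias towards "useful" pages can be approximated by keeping the URL set in a priority queue instead of a plain list. The sketch below assumes that some usefulness score is already available for each URL; how to compute that score is outside this example, and `PriorityFrontier` is a hypothetical name.

```python
import heapq
import itertools


class PriorityFrontier:
    """URL set that always hands out the highest-scored URL first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker: insertion order

    def push(self, url, usefulness):
        # heapq is a min-heap, so the score is negated.
        heapq.heappush(self._heap, (-usefulness, next(self._counter), url))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


frontier = PriorityFrontier()
frontier.push("http://example.test/spam", 0.1)
frontier.push("http://example.test/reference", 0.9)
frontier.push("http://example.test/blog", 0.5)

fetch_order = [frontier.pop() for _ in range(len(frontier))]
print(fetch_order)
```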
Contd..
• Freshness:
– The search engine should obtain fresh copies of previously fetched pages.
• Extensible:
– Crawlers should be designed to be extensible in many ways – to
cope with new data formats, new fetch protocols, and so on. This
demands that the crawler architecture be modular.
Problems with search using search engines
• Specifying query keywords can be challenging:
– Search results are affected by the structure of the query phrase.
– Search results may also be affected by ambiguity in the English
language, e.g., the word "current" has several meanings.
Contd..
• It is difficult for the search engine to be certain about what users
want.
– Some users may be seeking a particular destination, while others
may want only a small number of highly relevant results.
• Search engine and web users are diverse:
– They range from young to old.
– A search engine is therefore attempting to meet the needs of a
diverse group of users.
The goals of web search
• Depending on the nature of the query, the information needs of
users may be divided into three classes:
– Navigational
– Informational
– Transactional
Contd..
– Navigational:
• To reach a website that the user has in mind. The user may
know that the site exists, or may have visited it earlier, but
does not know its URL.
– Informational:
• To find a website that provides useful information about a topic
of interest. The user may not have a particular website in mind.
– Transactional:
• To go to a site to perform some kind of transaction, e.g., to buy a
book.
Quality of search result
• The quality of search results from a search engine ideally should
satisfy the following requirements:
– Precision:
• Precision is the percentage of retrieved documents that are
relevant.
• Ideally, only relevant documents should be returned.
– Recall:
• Recall is the percentage of all relevant documents on the web
that are retrieved.
• Ideally, all relevant documents should be returned.
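The two measures above can be computed as shown in this small sketch. The function name `precision_recall` and the document ids are made up for the example, and the set of relevant documents is assumed to be known, which on the real web it generally is not.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) as fractions between 0 and 1."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were retrieved
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# 4 documents retrieved, 3 documents relevant overall, 2 in common.
precision, recall = precision_recall(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d1", "d3", "d5"],
)
print(precision)         # 0.5
print(round(recall, 3))  # 0.667
```

Here half of what was returned is relevant (precision 50%), and two of the three relevant documents were found (recall about 67%).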
Contd..
• Ranking:
– A ranking of the documents providing some indication of the
relative relevance of the results should be returned.
• First screen:
– The first page of results should include the most relevant
results.
• Speed:
– Results should be provided quickly.
Search engine functionality
• A search engine is a complex collection of software modules. It
carries out a variety of tasks:
– Collecting information
– Evaluating and categorizing information
– Creating a database and creating indexes
– Computing ranks of the web documents
– Checking queries and executing them
– Presenting results
– Profiling the users
Contd..
• Collecting information:
– A search engine would normally collect web pages or information
about them by web crawling.
• Evaluating and categorizing information:
– The search engine evaluates pages before accepting them and
categorizes the information.
• Creating a database and creating indexes:
– The information collected needs to be stored either in a database
or some kind of file system. Indexes must be created so that the
information may be searched efficiently.
Contd..
• Computing ranks of the web documents: the search engine ranks web
pages before returning them in response to user queries.
• Checking queries and executing them: queries posed by users need
to be checked, for example, for spelling errors. Once checked, a query
is executed by searching the search engine database.
• Presenting results: the search engine determines which results to
present and how to display them.
• Profiling the users: to improve search performance, search
engines carry out user profiling, which studies the way users use
search engines.
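As an illustration of query checking, the sketch below corrects misspelled query words against a known vocabulary using `difflib.get_close_matches` from the Python standard library. The vocabulary and the function name `correct_query` are assumptions for this example; it is a toy approach and not how any particular search engine checks spelling.

```python
import difflib


def correct_query(query, vocabulary):
    """Replace unknown query words with their closest vocabulary word."""
    corrected = []
    for word in query.lower().split():
        if word in vocabulary:
            corrected.append(word)
            continue
        # Keep the original word if nothing in the vocabulary is close.
        matches = difflib.get_close_matches(word, vocabulary, n=1)
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)


vocabulary = ["search", "engine", "crawler", "index"]
print(correct_query("serach engine", vocabulary))  # search engine
```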
Page Ranking
• The web consists of a huge number of documents that have been
published without any quality control.
• Page ranking is a method for determining the relative
importance and quality of a page for a given query.
• The best-known ranking algorithm is the PageRank
algorithm.
Contd..
• Page rank algorithm:
– It was developed by Larry Page and Sergey Brin at Stanford University.
– A hyperlink to a page counts as a vote of support.
• A page that is linked to by many pages receives a high rank; if
there are no links to a page, there is no support for it, so it gets
a low rank.
Contd..
– It assigns a numerical score between 0 and 1 to every node in the
web graph, that is, to each element of a hyperlinked set of
documents.
– The rank value indicates the importance of a particular page.
• A PageRank of 0.5 means there is a 50% chance that a person clicking
on a random link will be directed to that document.
Contd..
• Algorithm with illustrative example:
– Assume a small universe of four web pages A, B, C and D.
– The initial approximation of PageRank is divided evenly among
the four documents.
– Hence each document begins with an estimated PageRank of 0.25.
– If pages B, C and D each link only to A, they each confer their 0.25
PageRank on A.
– i.e. PR(A) = PR(B) + PR(C) + PR(D) = 0.75
Contd..
• Suppose instead that page B links to page C as well as to page A, page D
links to all three other pages, and page C links only to A.
• The value of a page's link vote is divided among all the outbound links on the page.
• Thus B gives a vote worth 0.125 to page A and a vote worth 0.125 to page C.
• Similarly, D gives a vote worth approximately 0.083 to each of A, B and C.
• i.e. PR(A) = PR(B)/2 + PR(C)/1 + PR(D)/3 ≈ 0.458
• In general, PR(m) = sum of PR(v)/L(v) over all pages v in Bm
• Where L(v) = number of outbound links on page v
• Bm = set of all pages that link to page m
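The four-page example can be checked with a short sketch. It performs a single update step of the simplified rule used on these slides, where each page divides its current rank equally among its outbound links. The damping factor and the repeated iteration used in the full PageRank algorithm are left out here, and `pagerank_step` is a hypothetical helper name.

```python
def pagerank_step(links, ranks):
    """One update: each page splits its rank equally over its outbound links."""
    new_ranks = {page: 0.0 for page in ranks}
    for page, targets in links.items():
        if not targets:
            continue  # a page without outbound links casts no votes here
        share = ranks[page] / len(targets)
        for target in targets:
            new_ranks[target] += share
    return new_ranks


initial = {page: 0.25 for page in ["A", "B", "C", "D"]}

# First case: B, C and D each link only to A.
first = pagerank_step({"A": [], "B": ["A"], "C": ["A"], "D": ["A"]}, initial)

# Second case: B links to A and C, C links to A, D links to A, B and C.
second = pagerank_step(
    {"A": [], "B": ["A", "C"], "C": ["A"], "D": ["A", "B", "C"]}, initial
)

print(round(first["A"], 3))   # 0.75
print(round(second["A"], 3))  # 0.458
```

In the second case B ends up with about 0.083 (its only vote comes from D) and C with about 0.208 (0.125 from B plus 0.083 from D).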
Homework
• What is a search engine? Explain the various components of search engine
architecture.
• What are the roles of the crawler and the indexer?
• Explain the different functions of a search engine.
• What are the primary goals of web search?
• Describe the PageRank algorithm. Using an example, show how it works.
• How is web search different from text retrieval?
Thank You !
220711130083 SUBHASHREE RAKSHIT Internet resources for social science
 
220711130097 Tulip Samanta Concept of Information and Communication Technology
220711130097 Tulip Samanta Concept of Information and Communication Technology220711130097 Tulip Samanta Concept of Information and Communication Technology
220711130097 Tulip Samanta Concept of Information and Communication Technology
 
Non-Verbal Communication for Tech Professionals
Non-Verbal Communication for Tech ProfessionalsNon-Verbal Communication for Tech Professionals
Non-Verbal Communication for Tech Professionals
 
8+8+8 Rule Of Time Management For Better Productivity
8+8+8 Rule Of Time Management For Better Productivity8+8+8 Rule Of Time Management For Better Productivity
8+8+8 Rule Of Time Management For Better Productivity
 

Search Engines

  • 1. Search engines 7/2/2019 Compiled by: Kamal Acharya 1
  • 2. Introduction • Most libraries have a relatively small collection of documents and a catalogue for searching them. • The web, on the other hand, is a very large collection of documents. • Search engines allow a user to search the web for information. • Google dominates the search engine market (about 75%).
  • 3. Differences between web search and information retrieval • A web search is very different from a conventional information retrieval (IR) search over a document collection because of the following factors: – Bulk: • The web is much larger than any document collection used in IR. – Diversity: • Web pages may contain text, tables, images, video, and audio rather than text alone. – Growth: • The web is growing exponentially.
  • 4. Contd.. – Dynamic: • The web changes significantly over time, whereas a text collection does not. – Demanding users: • Users demand immediate results. – Quality of documents: • Text documents are usually of high quality, but web documents may not be. – Hyperlinks: • Hyperlinks are very important components of web documents. – Queries: • Web queries are short and ambiguous.
  • 5. Search engine • A search engine is a program that searches documents for specified keywords and returns a list of the documents in which the keywords are found. • A search engine consists of the following main components: – Crawler (spider) – Indexer – Search engine user interface
  • 6. Contd… • A typical search engine architecture is shown in the figure on this slide.
  • 7. Contd.. • How does a search engine work? A search engine operates in the following order: 1. Web crawling 2. Indexing 3. Searching
  • 8. Contd.. • Web crawling: – A search engine has a huge database of web pages. This database is built and updated automatically by the web crawler. – The web crawler works as follows: • The crawler begins with one or more URLs that constitute a URL set. • It picks a URL from this URL set and fetches the web page at that URL. • The fetched page is parsed to extract both the text and the links from the page. • The extracted links (URLs) are added to the URL set. • The extracted text is fed to a text indexer.
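The crawling steps on slide 8 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `TOY_WEB`, `PageParser`, and `crawl` are hypothetical names that do not come from the slides, and the "web" is an in-memory dictionary so the sketch runs without network access. A real crawler would fetch pages over HTTP.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical in-memory "web": URL -> HTML (assumption, stands in for HTTP fetches).
TOY_WEB = {
    "http://a.example/": '<p>Search engines crawl the web.</p>'
                         '<a href="http://b.example/">b</a>'
                         '<a href="http://c.example/">c</a>',
    "http://b.example/": '<p>Crawlers follow links.</p>'
                         '<a href="http://a.example/">a</a>',
    "http://c.example/": '<p>Indexers build an inverted index.</p>',
}


class PageParser(HTMLParser):
    """Collects the text and the link targets (href values) of one page."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        if data.strip():
            self.text_parts.append(data.strip())


def crawl(seed_urls, fetch):
    """Visit pages breadth-first; return {url: extracted text}."""
    frontier = deque(seed_urls)   # the URL set still to be visited
    seen = set(seed_urls)         # avoids fetching the same URL twice
    pages = {}
    while frontier:
        url = frontier.popleft()          # pick a URL from the URL set
        html = fetch(url)                 # fetch the page at that URL
        if html is None:
            continue
        parser = PageParser()
        parser.feed(html)                 # parse out text and links
        pages[url] = " ".join(parser.text_parts)  # text goes to the indexer
        for link in parser.links:         # extracted links join the URL set
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return pages


pages = crawl(["http://a.example/"], TOY_WEB.get)
```

Starting from the single seed URL, the sketch reaches all three toy pages by following links. The `seen` set is a design choice not mentioned on the slide; without it the link from b back to a would cause an endless loop.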
  • 9. Contd.. • Indexing: – The indexer module of the search engine indexes the extracted text supplied by the web crawler. – The most commonly used index structure is the inverted index, which maps each word to the list of documents that contain it.
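A minimal sketch of an inverted index, assuming a tiny hand-made document set. The names `tokenize`, `build_inverted_index`, and `docs` are illustrative and not from the slides, and real indexers also store positions, frequencies, and compressed postings.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each word to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


# Hypothetical documents, standing in for text extracted by the crawler.
docs = {
    1: "Web crawlers fetch web pages",
    2: "The indexer indexes pages",
    3: "Users search the index",
}
index = build_inverted_index(docs)
```

Looking up `index["pages"]` gives the documents containing that word without scanning every document, which is the point of inverting the document-to-words relation.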
  • 10. Contd..
  • 11. Contd.. • Searching: – When a user submits a query to the search engine, the user is not searching the entire web, only the database that the search engine has compiled. – The query parser splits the user's query into words. – These words are matched against the words in the inverted index of the indexed documents. – The matching documents are returned to the user in ranked order.
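The search step on slide 11 can be illustrated with a small sketch: parse the query into words, look each word up in an inverted index, and return the matching documents in ranked order. The scoring used here (sum of term frequencies) is an assumption chosen for brevity, not the ranking method of any particular search engine. All names are hypothetical.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Inverted index with term frequencies: term -> {doc_id: count}."""
    index = defaultdict(Counter)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term][doc_id] += 1
    return index


def search(query, index):
    """Return (doc_id, score) pairs, best score first, ties broken by doc_id."""
    scores = Counter()
    for term in tokenize(query):                  # query parser
        for doc_id, tf in index.get(term, {}).items():
            scores[doc_id] += tf                  # assumed toy scoring: sum of tf
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


docs = {
    "d1": "web search engines index the web",
    "d2": "a library catalogue",
    "d3": "search the catalogue",
}
index = build_index(docs)
results = search("web search", index)
```

Note that the query only touches the postings of its own words; documents that contain none of them (here `d2`) are never examined.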
  • 12. Characteristics of search engines • Features a search engine must provide: – Robustness: • The search engine must be distributed over a large number of machines so that the failure of one machine does not bring down the whole engine. – Politeness: • Web servers have policies regulating the rate at which a search engine can visit them. These politeness policies must be respected.
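One common way servers publish such policies is a robots.txt file, which Python's standard `urllib.robotparser` module can read. The sketch below parses a made-up robots.txt from a string. The rules, the `toybot` user agent, and the `site.example` URLs are all assumptions for illustration, and the slides do not name robots.txt specifically.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (assumption, not taken from a real site).
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 2
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# May this crawler fetch these URLs at all?
allowed = parser.can_fetch("toybot", "http://site.example/index.html")
blocked = parser.can_fetch("toybot", "http://site.example/private/data.html")

# How many seconds should the crawler wait between requests to this server?
delay = parser.crawl_delay("toybot")
```

A polite crawler would skip the blocked URL and sleep for `delay` seconds between successive requests to the same host.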
  • 13. Contd.. • Features a search engine should provide: – Distributed: • The search engine should be able to execute in a distributed fashion across multiple machines. – Scalable: • The search engine architecture should permit scaling up the search rate by adding extra machines.
  • 14. Contd.. • Performance and efficiency: – The search system should make efficient use of various system resources, including processor, storage, and network. • Quality: – Given that a significant fraction of all web pages are of poor utility for serving user query needs, the search engine should be biased towards fetching “useful” pages first.
  • 15. Contd.. • Freshness: – The search engine should obtain fresh copies of previously fetched pages. • Extensible: – Crawlers should be designed to be extensible in many ways, to cope with new data formats, new fetch protocols, and so on. This demands that the crawler architecture be modular.
  • 16. Problems with search using search engines • Specifying query keywords can be challenging: – Search results are affected by the structure of the query phrase. – Search results may also be affected by ambiguity in the English language, e.g., the word "current".
  • 17. Contd.. • It is difficult for the search engine to be certain about what users want. – Some may be seeking a destination, while others may want only a small number of highly relevant results. • Diversity of search engine and web users – Users range from young to old. – A search engine is therefore attempting to meet the needs of a diverse group of users.
  • 18. The goals of web search • Depending on the nature of the search engine query, the information needs of a user may be divided into three classes: – Navigational – Informational – Transactional
  • 19. Contd.. – Navigational: • To reach a website that the user has in mind. The user may know that the site exists, or may have visited it earlier, but does not know its URL. – Informational: • To find a website that provides useful information about a topic of interest. The user may not have a particular website in mind. – Transactional: • To go to a site to perform some kind of transaction, e.g., to buy a book.
  • 20. Quality of search results • The quality of search results from a search engine should ideally satisfy the following requirements: – Precision: • The percentage of retrieved documents that are relevant. • Ideally, only relevant documents should be returned. – Recall: • The percentage of all relevant documents on the web that are retrieved. • Ideally, all relevant documents should be returned.
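The two definitions on slide 20 translate directly into set arithmetic. The sketch below is illustrative only; the function name and the document ids are made up, and in practice the full set of relevant documents on the web is not known, so recall can only be estimated.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) as fractions between 0 and 1."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # documents that are both retrieved and relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result list and relevance judgements.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = ["d1", "d3", "d5", "d6", "d7"]
precision, recall = precision_recall(retrieved, relevant)
```

Here two of the four retrieved documents are relevant (precision 2/4), and two of the five relevant documents were retrieved (recall 2/5).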
  • 21. Contd.. • Ranking: – A ranking of the documents, providing some indication of the relative relevance of the results, should be returned. • First screen: – The first page of results should include the most relevant results. • Speed: – Results should be provided quickly.
  • 22. Search engine functionality • A search engine is a complex collection of software modules. A search engine carries out a variety of tasks: – Collecting information – Evaluating and categorizing information – Creating a database and creating indexes – Computing ranks of the web documents – Checking queries and executing them – Presenting results – Profiling the users
  • 23. Contd.. • Collecting information: – A search engine normally collects web pages, or information about them, by web crawling. • Evaluating and categorizing information: – The search engine evaluates submitted pages and categorizes the information. • Creating a database and creating indexes: – The information collected needs to be stored either in a database or in some kind of file system. Indexes must be created so that the information can be searched efficiently.
  • 24. Contd.. • Computing ranks of the web documents: web pages are ranked before being returned in response to user queries. • Checking queries and executing them: queries posed by users need to be checked, for example, for spelling errors. Once checked, a query is executed by searching the search engine database. • Presenting results: the search engine determines what results to present and how to display them. • Profiling the users: to improve search performance, search engines carry out user profiling, which deals with the way users use search engines.
  • 25. Page Ranking • The web consists of a huge number of documents that have been published without any quality control. • Page ranking is a method for determining the relative importance and quality of a page for a given query. • The best-known ranking algorithm is the PageRank algorithm.
  • 26. Contd.. • PageRank algorithm: – It was developed by Larry Page at Stanford University. – A hyperlink to a page counts as a vote of support. • A page that is linked to by many pages receives a high rank. If there are no links to a web page, there is no support for that page, so it gets a low rank.
  • 27. Contd.. – PageRank assigns a numerical score between 0 and 1 to every node in the web graph, i.e. to each element of a hyperlinked set of documents. – The rank value indicates the importance of a particular page. • A PageRank of 0.5 means there is a 50% chance that a person clicking on a random link will be directed to the document with the 0.5 PageRank.
  • 28. Contd.. • Algorithm with illustrative example: – Assume a small universe of four web pages A, B, C and D. – The initial approximation of PageRank is divided evenly between the four documents. – Hence each document begins with an estimated PageRank of 0.25. – If pages B, C and D each link only to A, they each confer 0.25 PageRank to A. – i.e. PR(A) = PR(B) + PR(C) + PR(D) = 0.75
  • 29. Contd.. • Suppose that page B links to page C as well as to page A, page D links to all three other pages, and page C links only to A. • The value of a page's link vote is divided among all the outbound links on that page. • Thus B gives a vote worth 0.125 to page A and a vote worth 0.125 to page C. • Similarly, each of D's three votes is worth approximately 0.083 (0.25/3). • i.e. PR(A) = PR(B)/2 + PR(C)/1 + PR(D)/3 ≈ 0.458 • In general, PR(m) = Σ PR(v)/L(v), summed over all pages v in Bm • Where L(v) = number of outbound links on page v • Bm = set of all pages that link to page m
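The two worked examples on slides 28 and 29 can be reproduced with a single update step of the simplified rule, in which each page splits its current rank equally among its outbound links. This is a sketch under stated assumptions: the function and variable names are hypothetical, page A is assumed to have no outbound links (the slides do not say), and the full PageRank algorithm additionally uses a damping factor and repeated iteration, neither of which is shown here.

```python
def pagerank_step(ranks, out_links):
    """One update of simplified PageRank: PR(m) = sum of PR(v)/L(v) over v linking to m."""
    new_ranks = {page: 0.0 for page in ranks}
    for page, targets in out_links.items():
        if not targets:
            continue  # a page with no outbound links casts no votes
        share = ranks[page] / len(targets)  # vote value split over outbound links
        for target in targets:
            new_ranks[target] += share
    return new_ranks


# Initial approximation: rank divided evenly over the four pages.
initial = {"A": 0.25, "B": 0.25, "C": 0.25, "D": 0.25}

# Slide 28: B, C and D each link only to A (A's own links are an assumption: none).
first = pagerank_step(initial, {"A": [], "B": ["A"], "C": ["A"], "D": ["A"]})

# Slide 29: B -> A, C;  C -> A;  D -> A, B, C.
second = pagerank_step(
    initial,
    {"A": [], "B": ["A", "C"], "C": ["A"], "D": ["A", "B", "C"]},
)
```

`first["A"]` matches the 0.75 on slide 28, and `second["A"]` is 0.125 + 0.25 + 0.25/3, the value derived on slide 29. Because A has no outbound links in this sketch, its rank is not passed on, which is one of the cases the damping factor in the full algorithm is meant to handle.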
  • 30. Homework • What is a search engine? Explain the various components of search engine architecture. • What are the roles of the crawler and the indexer? • Explain the different functions of a search engine. • What are the primary goals of web search? • Describe the PageRank algorithm. Using an example, show how it works. • How is web search different from text retrieval?
  • 31. Thank You!