
Prospects for the development of search engines. Internet search engines. How often do search engines change their algorithms?

17.05.2021

KOVROV STATE TECHNOLOGICAL ACADEMY

Information and analytical report on informatics

on the topic: “Modern search engines, development trends of one of the market leaders, Yandex”.

Completed by: 1st year student

of academic group 3

Makarov Ivan

Introduction.

Main part.

Conclusion.

Introduction.

Yandex is a Russian IT company that owns a search engine of the same name on the Web and an Internet portal. The Yandex search engine is the eighth largest search site in the world in terms of the number of processed search queries (1.290 billion, statistics for August 2009) and the second largest non-English search server after the Chinese Baidu.

The company's website opened on September 23, 1997, and Yandex was established as a company in 2000. It was founded by CompTek, the company that had developed and supported the Yandex search engine. Yandex became self-sustaining in 2002. Its turnover was 35.6 million dollars in 2005 (net profit 13.6 million) and 72.6 million dollars in 2006 (net profit 29.9 million).

The company's main priority is the development of its search engine, but over the years Yandex has become a multi-service portal. By 2009, Yandex had more than 30 services. The most popular include Yandex.News, Yandex.Fotki and Yandex.Toys.

The main office of the company is located in Moscow. The company has offices in St. Petersburg, Yekaterinburg, Odessa, Simferopol and Kyiv. In mid-June 2008, the company announced the opening of Yandex Labs - an office in the US, California.

Main part.

History of the company.

The Yandex.Ru search engine was officially announced on September 23, 1997 at the Softool exhibition. Its main distinguishing features at that time were checking the uniqueness of documents (excluding copies in different encodings) and the key properties of the Yandex search engine: taking into account the morphology of the Russian language (including searching for an exact word form), proximity search (including within a paragraph and for an exact phrase), and a carefully developed algorithm for assessing relevance (how well the answer corresponds to the query). This algorithm took into account not only the number of query words found in the text, but also the "contrast" of a word (its relative frequency in the given document), the distance between words, and the position of a word in the document.

A little later, in the section "Tales" (observations on the content of the Russian Internet), the first tale of the Runet appeared - "Web - humanism or chernukha?". And in the "Numbers" section - the first estimate of the volume of the Runet, 5 thousand servers and 4 GB of texts.

Two months later, in November 1997, natural language queries were implemented. Since then, Yandex.Ru can be queried simply “in Russian”, with long queries such as “where to buy a computer”, “genetically modified products” or “international telephone connection”, and return accurate answers. The average length of a query in Yandex.Ru is now 2.7 words. In 1997, when search engine users were accustomed to a telegraphic style, it was 1.2 words.

In 1998, Yandex.Ru introduced the ability to “find a similar document”, a list of found servers, search within a given date range, and sorting search results by last modified time. During this year, the "volume" of the Russian Internet has doubled, which led to the need to optimize search engines. Both then and now (with a volume of 200 GB), the search speed on Yandex.Ru is a fraction of a second.

During 1999, the Runet grew by an order of magnitude, both in the volume of texts and in the number of users. It was a year of rapid development for Yandex.Ru as well. The new search robot made it possible to optimize and speed up the bypass of Runet sites. Today, the Yandex.Ru search base is twice as large as that of its closest competitors.

The new robot made it possible to provide users with new features - search in different text areas (headings, links, annotations, addresses, captions for pictures), limiting the search to a group of sites, searching for links and images, and highlighting documents in Russian. There was a search in the categories of the catalog and for the first time in Runet the concept of "citation index" was introduced - the number of resources that refer to this one.
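The "citation index" described above (the number of resources that refer to a given one) can be illustrated with a minimal Python sketch. The link graph and site names below are made up for the example, and the counting rule (self-links ignored) is an assumption; this is not Yandex's actual formula.

```python
from collections import Counter

# Hypothetical link graph: each site maps to the set of sites it links to.
links = {
    "a.example": {"b.example", "c.example"},
    "b.example": {"c.example"},
    "c.example": {"a.example"},
    "d.example": {"c.example", "d.example"},  # contains a self-link
}

def citation_index(link_graph):
    """Count, for every site, how many *other* sites link to it."""
    counts = Counter()
    for source, targets in link_graph.items():
        for target in targets:
            if target != source:  # assumption: a site citing itself does not count
                counts[target] += 1
    return counts

index = citation_index(links)
print(index.most_common())  # sites ordered from most to least cited
```

Sorting a catalog by such a count puts the most frequently cited resources first.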

Throughout the year, work continued on the quantitative and qualitative analysis of the Runet. The NINI-index was opened (index "Inconsistency of Interests of the Population of the Internet"), showing the dynamics of changes in the interests of Internet users. A search Forum and a new service have been opened - a subscription to a request, that is, you can leave your request on Yandex.Ru and regularly receive information by e-mail about the appearance of new and / or modified documents corresponding to this request. By the beginning of the school year, the "Family Yandex" was opened, filtering search results from obscene language and pornography.

The origin of the word "Yandex".

Today "Yandex" is a word from the everyday life of an Internet user. Phrases such as “What, has Yandex been canceled already?”, “Loneliness is when Yandex is the first to congratulate you on your birthday” and “All questions to Yandex” are common on the Web. Many already think that this has always been the case. In a way, this is true: Yandex really did appear at the same time as the mass Internet, when access to the network ceased to be the privilege of a few technical specialists. But the word "Yandex" itself is artificial and has its own authors and its own history.

In 1993, Arkady Volozh, the future CEO of the future Yandex company, and Ilya Segalovich, the company's future technology director, developed, as it turned out later, the main technology - the search for unstructured information, taking into account the Russian language.

The development had to be named somehow. Ilya remembers how he wrote down different derivatives of words describing the meaning of technology in a column. It quickly became clear that search (“search”) in Russian sounds too dissonant and you can’t make a successful combination based on it. The word index was more appropriate. So yandex appeared in the list of names - yet another indexer ("another indexer" or Language index). Both Ilya and Arkady liked the option - it is easy to pronounce, easy to write. In addition, Arkady suggested the letter "I" in the name - specifically Russian - Russian and leave it for clarity. So the word "Yandex" was invented. And the program file, respectively, was called yandex.exe.

In 1996, when search was first offered to the general public as a technology rather than as part of a content product (before that, there had been the International Classifier of Inventions and the Bible Computer Reference), the line of programs was named Yandex, and the name was explained as Language iNDEX. The first programs in the line were Yandex.Site (search within one's own site; this product is now called Yandex.Server) and Yandex.Dict (a morphological add-on for AltaVista, the only search engine at the time that could work with Cyrillic at all).

But, of course, the word "Yandex" has become widespread since September 1997, after the launch of the search engine www.yandex.ru. Since then, users of the system have been offering us their interpretations. For example, Tyoma Lebedev, preparing to draw the first version home page Yandex website, said: “Ah, I understand, if the first “I” in the word index is translated into Russian, it will be “I”, that is, this will be “Yandex”. The authors honestly admitted that they did not think about it, but - a good interpretation, is accepted. Then someone on the Web suggested another option, seeing the two sides of the Internet, INdex and YANDEX. This word has already appeared derivatives, for example, Yandex employees are often called "Yandexoids" and less often - "Yandexians".

Search "Yandex".

Yandex search allows you to search the Runet, Uanet, and Kaznet (since October 14, 2009) for documents in Russian, Ukrainian, Belarusian, Romanian, English, German and French, taking into account the morphology of Russian and English and the proximity of words in a sentence. Since the beginning of 2006, Yandex search has been installed on the Mail.ru portal.

In addition to HTML web pages, Yandex indexes documents in PDF (Adobe Acrobat), Rich Text Format (RTF), the binary formats of Microsoft Word, Microsoft Excel and Microsoft PowerPoint, SWF (Macromedia Flash), and RSS (blogs and forums).

A distinctive feature of Yandex is the ability to fine-tune the search query using a flexible query language. For example, the exclusion operation can be given a scope: the query A ~~ B finds documents (pages) that contain A but do not contain B, while the query A ~ B finds documents in which the word B does not occur in the same sentence as the word A. Similarly, the & operator looks for combinations of keywords within one sentence, and && within the entire document.

The ! operator disables morphology for a specific word, and the !! operator specifies the normal (dictionary) form of a word, which helps to get around some problems caused by homonymy. For example, the query !!Ivanov will find forms of the surname such as Ivanov and Ivanova, but not Ivan.
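To make the difference between sentence scope and document scope concrete, here is a minimal Python sketch of the four operators described above. It is only an illustration under simplifying assumptions: sentences are split on ".", "!" and "?", words are compared literally without any morphology, and "A ~ B" is read as "at least one sentence contains A without B". The function names are hypothetical and this is not Yandex's implementation.

```python
import re

def sentences(text):
    """Split text into sentences, each represented as a set of lowercase words."""
    parts = re.split(r"[.!?]+", text)
    return [set(re.findall(r"\w+", p.lower())) for p in parts if p.strip()]

def all_words(text):
    """All distinct words of the document."""
    return set().union(*sentences(text))

def and_sentence(doc, a, b):
    """A & B: both words occur in one sentence."""
    return any(a in s and b in s for s in sentences(doc))

def and_document(doc, a, b):
    """A && B: both words occur somewhere in the document."""
    words = all_words(doc)
    return a in words and b in words

def not_sentence(doc, a, b):
    """A ~ B: some sentence contains A but not B (assumed reading)."""
    return any(a in s and b not in s for s in sentences(doc))

def not_document(doc, a, b):
    """A ~~ B: the document contains A and does not contain B."""
    words = all_words(doc)
    return a in words and b not in words

doc = "Yandex indexes pages. Google ranks pages quickly."
print(and_sentence(doc, "yandex", "google"))   # different sentences
print(and_document(doc, "yandex", "google"))   # same document
```

In this sample document the words "yandex" and "google" never share a sentence, so the sentence-level operator rejects the document while the document-level one accepts it.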

By default, Yandex displays 10 links on each results page; in the search results settings, you can increase the page size to 20, 30, or 50 found documents. Sometimes the order of the sites on these pages may differ, since the databases for these results are not updated at the same time.

If there are a lot of links found for the query, the results page suggests limiting the search range - by region (that is, by IP range) or by date. If nothing is found for any word or words, it is proposed to replace it / them with similar ones (since the proposed options depend on the frequency of finding similar words, funny situations sometimes arise). Also, it is proposed to correct the words typed in the wrong keyboard layout.
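The correction of words typed in the wrong keyboard layout can be sketched as a character-by-character mapping between the standard QWERTY and ЙЦУКЕН key positions. The Python snippet below is a minimal illustration with a hypothetical function name; it only performs the mapping for lowercase letters and does not decide whether a correction should be offered, which a real system would have to do (for example, by checking the result against a dictionary).

```python
# Keys of the standard QWERTY layout and the Russian letters on the same keys.
LATIN = "qwertyuiop[]asdfghjkl;'zxcvbnm,."
CYRILLIC = "йцукенгшщзхъфывапролджэячсмитьбю"
LAYOUT_TABLE = str.maketrans(LATIN, CYRILLIC)

def fix_layout(text):
    """Re-type a string as if the Russian layout had been active."""
    return text.lower().translate(LAYOUT_TABLE)

print(fix_layout("zyltrc"))   # яндекс
print(fix_layout("gjujlf"))   # погода
```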

From time to time, Yandex changes the algorithms responsible for the relevance of search results, which changes the results returned for search queries. The most recent officially announced changes were in March 2004, April 2005 and January 2007; according to unofficial information, there have been many more (for example, the latest in August-September 2007).

In particular, these changes are directed against search spam, which leads to irrelevant results for some queries (less often for entire families of queries). Search spam that is not filtered out automatically is countered by semi-automatic and manual moderation of search results (with the help of so-called "white hat optimizers"), as well as by outright refusal to index "malicious" sites.

Owners, management and performance indicators.

More than 30% of the company, according to its own data, belongs to the investment funds ru-Net Holdings and Baring Vostok Capital Partners, 15% - to the Tiger Technologies fund, about 30% - to the founders of the company and 20% - to managers and other minority shareholders.

In mid-September 2009, it became known that the parent company of Yandex, the Dutch company Yandex N.V., issued a priority share, which was transferred to Sberbank for a symbolic 1 euro. The only right that a share gives is to veto the sale of more than 25% of the company's shares.

Management: Arkady Volozh - General Director, Ilya Segalovich - Technical Director, Elena Kolmanovskaya - Editor-in-Chief, Alexei Tretyakov - Commercial Director, Svetlana Kondrashova - Advertising Director.

All Yandex services.

Information retrieval:

Search and ya.ru

Directory - a directory of websites sorted by citation index. It is replenished manually by catalog editors, there is a possibility of paid registration.

News - The top news of the day, sourced from the mainstream media featured on the Internet. It is possible to search by news, as well as subscribe to news for a given search query.

Yandex.XML - using this service, you can make automatic search queries to Yandex in xml format.

Search on blogs and forums - search for resources that have an RSS-representation, as well as a rating of current queries, popular categories and news.

Market - search for offers for the sale of goods and services, selection of models.

"Meditative" search is the only search service in the world that has a "Search" button, but no search bar.

Dictionaries - encyclopedias, reference books, translation dictionaries.

Pictures - image search.

Video - video search.

Maps - maps of Europe and Russia, maps of major cities of the Russian Federation (down to individual buildings), search on the map, as well as the ability to "wander" through the streets of some cities.

Addresses - search for contact information by the names of firms and organizations.

Poster - information about available events: cinema, theater, concerts, sports, clubs, etc.

Weather - weather forecast.

TV program - programs of central, regional and satellite TV channels.

Timetables - timetables for trains and planes.

Personalized:

Yandex.Video - video hosting and video search.

Mail - email.

Ya.ru is a blogging service.

Yandex.Fotki - photo hosting.

Spam defense - spam filtering.

People - free hosting for personal web pages, as well as a file storage service.

Yandex.Money - a payment system that allows you to pay for goods and services on the Internet.

Bookmarks - a bookmark storage system integrated with Yandex.Bar.

Subscriptions - subscription to news.

Feed - online RSS reader

Yandex.Direct is a system for placing contextual advertising with pay per click.

The Cup is a regular Internet search competition.

Cities - Internet indices of Russian cities.

Tariff - search by tariffs of Internet providers.

Postcards

Spring - automatic generation of philosophical essays.

Internet - measures the speed of the Internet connection.

Mirror - A mirror of major Linux OS distributions, as well as FreeBSD and other projects.

Yandex.Local network - provides an opportunity to use all Yandex services not at the federal, but at the local rate.

Metrica - allows you to measure traffic, analyze user behavior and evaluate the effectiveness of advertising campaigns.

Software products:

Spam filter Spamodefense for corporate use (paid).

Yandex Desktop Search - a program for searching files on a computer.

Ya.Online - an instant messaging program based on Jabber. It also delivers notifications about new messages in Yandex.Mail and about new events on the sites Odnoklassniki.ru and VKontakte.

Punto Switcher - an automatic keyboard layout switcher.

Widgets for the Mac OS X and Windows Vista operating systems, as well as for the Opera browser: Search, Traffic, Clock, News.

Yandex ICQ - a special version of the ICQ client with symbols and integration of some services from Yandex.

Interesting facts.

1) The average length of a query in Yandex.Ru now is 2.7 words. In 1997, it was 1.2 words, when search engine users were accustomed to telegraphic style.

2) Yandex appeared before www.yandex.ru. The word "Yandex" was invented in 1993 and first used publicly in 1996, when it meant not a company or a search engine but a technology for searching one's own server and a morphological add-on for the AltaVista.com search engine.

3) www.yandex.ru was launched to demonstrate the capabilities of Yandex technology; no one was thinking about making money from advertising.

4) The slogan “There is everything” was invented in 2000. In the same year, Yandex launched the first advertisement for the website on Russian television.

5) According to Yandex itself, about 80 percent of its audience is from Russia, about 3 percent from Europe, and just over 1 percent from the United States.

6) Some of the Yandex technical support staff operates under the collective pseudonym "Platon Shchukin".

Conclusion.

So, now we have complete information about Yandex. We know who manages it, how it works from the inside, what is the history of the company's development and much more. Now we can easily understand why Yandex is the leader in the Russian and global markets. I think the main reason for the success of Yandex is that the search engine copes well with the complexities of the Russian language. That is why search engines that were developed for English cannot index and rank Russian-language documents as well. The second advantage I see is the creative, friendly, cheerful slogans with which Yandex attracts users to use its services. Thematic pictures that Yandex places near its search line are much more accessible for a Russian user.

To search in the index, the user must formulate a query and send it to the search engine. The request can be very simple, at least it should consist of one word. To build a more complex query, you need to use Boolean operators that allow you to refine and expand the search conditions.

The most commonly used Boolean operators are:

AND - all expressions connected by the "AND" operator must be present on the searched pages or documents. Some search engines use the "+" operator instead of the AND word.
OR - at least one of the expressions connected by the "OR" operator must be present on the searched pages or documents.
NOT - the expression or expressions following the "NOT" operator should not (should not) appear on the searched pages or documents. Some search engines use the "-" operator instead of the word NOT.
FOLLOWED BY - one of the expressions must immediately follow the other.
NEAR - one of the expressions must be at a distance from the other, no more than the specified number of words.
Quotes - Quoted words are treated as a phrase to be found in a document or file.
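The operators listed above can be sketched in Python over a tiny positional inverted index. The three sample documents and all function names are made up for the illustration; real search engines use far more elaborate index structures and query parsers.

```python
from collections import defaultdict

# Hypothetical mini-collection: document id -> text.
docs = {
    1: "cheap flights to moscow",
    2: "moscow weather forecast",
    3: "cheap hotels in kazan",
}

def build_index(documents):
    """Map each word to {doc_id: [positions of the word in that document]}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for pos, word in enumerate(text.lower().split()):
            index[word].setdefault(doc_id, []).append(pos)
    return index

INDEX = build_index(docs)

def postings(word):
    """Positions of a word per document; empty dict if the word is unknown."""
    return INDEX.get(word.lower(), {})

def q_and(a, b):
    return set(postings(a)) & set(postings(b))

def q_or(a, b):
    return set(postings(a)) | set(postings(b))

def q_not(a, b):
    """a NOT b: documents containing a but not b."""
    return set(postings(a)) - set(postings(b))

def q_near(a, b, max_gap):
    """Documents where a and b are at most max_gap word positions apart."""
    return {d for d in q_and(a, b)
            if any(abs(i - j) <= max_gap
                   for i in postings(a)[d] for j in postings(b)[d])}

def q_followed_by(a, b):
    """Documents where b comes immediately after a."""
    return {d for d in q_and(a, b)
            if any(i + 1 in postings(b)[d] for i in postings(a)[d])}

print(q_and("cheap", "moscow"))             # {1}
print(q_not("cheap", "moscow"))             # {3}
print(q_followed_by("weather", "forecast")) # {2}
```

A quoted phrase can be treated in this sketch as a chain of FOLLOWED BY checks over consecutive words.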

Prospects for the development of search engines

A search defined by Boolean operators is literal: the machine looks for words or phrases exactly as they are entered. This can cause problems when the entered words are ambiguous. For example, the English word "bed" can mean a piece of furniture, a flower bed, a place where fish spawn, and much more. If the user is interested in only one of these meanings, pages that use the word in its other meanings are of no use. It is possible to build a literal search query that cuts off the unwanted meanings, but it would be better if the search engine itself could help.

One alternative is conceptual search. Part of this approach involves statistical analysis of pages containing the words or phrases entered by the user in order to find other pages that might interest that user. Clearly, conceptual search needs to store more information about each page, and each search query requires more computation. Many development teams are currently working on improving the quality and performance of this type of search engine. Other researchers have focused on a different area, known as natural-language queries.

The idea behind natural language queries is that the user formulates the query in the same way as they would ask a person sitting next to them, without having to keep track of Boolean operators or complex query structures. The most popular natural language search site today is AskJeeves.com, which analyzes the query to identify keywords that are then used to search the site index built by this search engine. This site handles only simple queries, but developers are competing intensely to build a natural language search engine capable of handling very complex ones.
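The step of identifying keywords in a natural language question can be sketched, in its simplest form, as removing common "stop words". The Python snippet below is only an illustration: the stop-word list is a small hand-picked assumption, the function name is hypothetical, and it does not describe how AskJeeves.com or any other real service analyzes queries.

```python
import re

# A tiny, hand-picked stop-word list; real systems use much larger ones.
STOP_WORDS = {"a", "an", "the", "to", "is", "are", "can", "i", "where", "what",
              "how", "do", "does", "in", "on", "of", "for", "and"}

def extract_keywords(question):
    """Return the words of the question that are not stop words, in order."""
    words = re.findall(r"[a-z0-9]+", question.lower())
    return [w for w in words if w not in STOP_WORDS]

print(extract_keywords("Where can I buy a computer?"))  # ['buy', 'computer']
```

The resulting keywords could then be passed to an ordinary keyword search over the index.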

Modern search engines are the most powerful hardware and software systems, the purpose of which is to index documents on the Internet to provide data at the request of users.

To provide high-quality, relevant information, search engines have to improve their ranking formulas constantly. Ensuring the highest quality of search results for users and preventing optimizers from manipulating them are the key goals in the development of search engines.

When search engines were just beginning to appear, their ranking algorithms were very primitive. Taking advantage of this, the most resourceful optimizers began promoting their sites so that they would appear in the search results for the queries they were interested in. As a result, resources that often carried no useful information for the user took the top positions, pushing more useful sites into the background.

In response, search engines went on the defensive, improving their ranking algorithms, introducing more and more variables into the formulas and taking new factors into account. Over time, this struggle between optimizers and search engines moved to a new level and contributed to the emergence of more advanced algorithms, including ones based on machine learning.

Stages of development of search engines:

As you can see from the diagram, the development of search engines and their algorithms goes in circles. Some create new algorithms, others adapt to them. It is difficult to say whether this process will ever stop, but personally I am inclined to believe that it will not. Despite the fact that search engine ranking algorithms have recently not only changed the significance of various factors, but also changed qualitatively, this does not frighten optimizers: their arsenal is constantly replenished with more and more new techniques.

How often do search engines change their algorithms?

Let's turn to the main search engine of the Runet - Yandex. Qualitative and fundamental changes in the ranking formulas in it occur on average once a year. Not so long ago, Yandex introduced a new search platform called Kaliningrad. Its essence lies in the formation of personal results for each user based on his search history and preferences.

In addition, do not forget that every search engine, including Yandex, constantly "tweaks" its ranking formulas: in automatic or semi-automatic mode, the influence of some factors is reduced while that of others is increased. All this is done with one goal: to improve the search results as much as possible by ridding them of sites that do not meet users' needs, and thereby to increase their relevance.

Looking at the changes in the Google search engine, you can see that the transformation of the ranking formula also occurs constantly, and Google itself reports hundreds of small changes from year to year. But if we are not talking about the ranking formula, but about the filters that help Google clear the search results from low-quality sites, then new versions of algorithms, such as Panda or Penguin, appear every 3-6 months.

The answer to the above question can be as follows: search engines are constantly improving their ranking algorithms, and cardinal changes occur on average once every 6-12 months.

Which search engine algorithms pose a real threat to promotion?

I would like to answer offhand: none. But still, let's figure it out. To do so, we need to ask ourselves: do search engines make it their goal to prevent search promotion?

I think not. There are several reasons for this:

1. Optimizers help search engines improve their algorithms, which ultimately leads to better search results. After all, if there were no optimizers, then search engines, most likely, would have stopped in their development in the year 2000.

2. Without optimizers, the results for many commercial queries would look like a collection of abstracts and useless informational articles.

If search promotion did not exist in principle, then it would not make sense for search engines to grow and develop as intensively as they do now.

Thus, we come to the following conclusion:

Search engines and SEO are closely and inextricably linked. That is why, as long as you follow the rules set by search engines, you need not be afraid of their algorithms at all: search engines do not aim to destroy SEO as such.

Development of search engine services

Speaking of search engines, do not forget that Yandex, Google and Bing each have their own services designed to help users. Over the years of their evolution, search engines have studied not only search results but also the behavior of their users, in order to increase satisfaction with those results.

For this purpose, the Yandex search engine came up with a mechanism of so-called "Sorcerers", which help the user get an answer to a question quickly. For example, when you enter the query "weather forecast", Yandex displays the weather for the current date directly on the search results page, saving the user from having to click through the results.

Other search engines, such as Google, went further and instead of "Sorcerers" offered a more interesting solution - the "Knowledge Graph".

The “Knowledge Graph” is the first step on Google's path to intelligent search. Thanks to this innovation, the search engine displays in the search results not only standard links, but also direct answers to user questions, a brief summary of the query object and facts related to it. Technically, the “Knowledge Graph” is a semantic network that links together various entities: individuals, events, spheres of life, things, categories. The information base for the “Knowledge Graph” comes from a number of sources: the open semantic database Freebase, Wikipedia, the CIA's open data collection and others.
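The idea of a semantic network that links entities can be illustrated with a very small Python sketch built on (subject, relation, object) triples. The relation names and helper functions are hypothetical, the few facts are taken from this text, and nothing here reflects how Google actually stores or queries its graph.

```python
from collections import defaultdict

# Facts as (subject, relation, object) triples; relation names are made up.
triples = [
    ("Yandex", "is_a", "search engine"),
    ("Yandex", "head_office_in", "Moscow"),
    ("Yandex", "launched_in", "1997"),
    ("Moscow", "located_in", "Russia"),
]

def build_graph(facts):
    """Index the triples as subject -> relation -> list of objects."""
    graph = defaultdict(lambda: defaultdict(list))
    for subject, relation, obj in facts:
        graph[subject][relation].append(obj)
    return graph

def answer(graph, subject, relation):
    """Direct answer to a question such as 'where is the head office of X?'."""
    return graph.get(subject, {}).get(relation, [])

def info_card(graph, subject):
    """Brief summary of everything known about an entity."""
    return {rel: list(objs) for rel, objs in graph.get(subject, {}).items()}

graph = build_graph(triples)
city = answer(graph, "Yandex", "head_office_in")[0]
print(city, answer(graph, city, "located_in"))  # follow a link to a related entity
print(info_card(graph, "Yandex"))
```

Following a relation from one entity to another, as in the last lines, is what allows related facts to be shown next to a direct answer.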

What conclusions can be drawn, you ask?

The answer is simple: search engines and their services will continue to develop towards quick and relevant answers to user questions, making it possible to get all the necessary information directly on the search engine results page (SERP) and eliminating the need to go to other sites.

There is an opinion that search engines, in their desire to answer the user's question here and now, may destroy search engine optimization by becoming a kind of global knowledge base. But such fears are groundless: to become global knowledge bases, search engines need information, and that information is stored on the very sites that optimizers work on, the same optimizers who help ensure that search engines do not stand still but constantly evolve.

As you can see, both SEO and search engines are links in the same chain that cannot exist without each other. Therefore, thoughts about the imminent death of SEO are unfounded. It is possible that search engine optimization will evolve over time, for example into consulting, but it will certainly not die. I wish you all successful promotion to the TOP!

Introduction

3.1 Gopher

3.2 WAIS

3.4 AltaVista

3.5 Opentext

3.6 Infoseek

4. Search robots

5.1 Rambler

5.2 Yandex

5.3 Port

6.1 Google

6.2 Yahoo

7.1 Baidu search engine

8. Prospects for the development of search engines

Conclusion

Bibliography

Introduction

Every Internet user can find a great deal of diverse and interesting information and make use of the rich possibilities of the network. The topic of this essay is highly relevant today: because people visit the World Wide Web so often, search engines have become indispensable. Internet resources have become an everyday working tool for people of many professions. The rapid growth of information on the Web has turned it into an ocean of diverse data whose importance grows in proportion to its volume. According to experts, the volume of information transmitted via the Internet doubles every six months. Every day, millions of new documents appear on the Web, and without search engines the vast majority of them would naturally remain unclaimed: nobody would find them, and all that huge amount of information would be useless.

There was therefore a need for tools that make it easy to navigate the information resources of global networks and to find the required information quickly and reliably, and special search tools now exist on the Internet. A few years ago, a common opinion was that the Internet has everything but nothing can be found there. However, with the advent and rapid development of search directories, search engines and all kinds of search programs, the situation has changed, and urgently needed information can now sometimes be found on the Web faster than in a book lying on the table.

Unfortunately, search engines often fail to accurately and fairly interpret resources. As a result, sites "far" from the issue being solved often appear in the first positions of the search. At the same time, resources that are of real use are "overboard" in the search.

The reason for this situation is simple and lies in the technology by which search engines obtain and present results. Paradoxical as it may seem, this is not the fault of the search engines themselves, since they are obliged to keep the rules for building search indexes secret. The fault lies in the technology of organizing the search itself.

A search engine is software that provides access to a collection of semi-structured information. Its orientation towards semi-structured data, i.e. data that cannot be represented as a relational table, distinguishes a search engine from a DBMS.

In this definition of a search engine, information of various kinds is implied, i.e. text, audio, video, images, etc. However, it should be noted that it is textual data that is ideal for describing the full functionality of a search engine, because multimedia information search algorithms are primarily based on text search algorithms.

The main task of a search engine is to minimize the time the user spends searching for the desired information. The question is: what information does the user want? In some circumstances, relevant information can be defined as all information in the database that is relevant to a query. Traditionally, two main characteristics are applied to a search engine: accuracy (precision) and completeness (recall), or rather the relationship between them. Each time the user submits a query to the system, thereby initiating a search, all documents in the search engine's collection are divided into four parts: relevant and found, relevant but not found, irrelevant but found, and irrelevant and not found. Accuracy characterizes one aspect of search, namely how well the search engine is able to minimize the time the user spends looking for information relevant to the query. Completeness characterizes another aspect: how well the system is able to find the information relevant to a given query. A query is optimal when every document found is relevant and every relevant document is found.
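The two characteristics can be computed directly from the sets of found and relevant documents: accuracy (precision) is the share of found documents that are relevant, and completeness (recall) is the share of relevant documents that were found. The Python sketch below uses made-up document identifiers and a hypothetical function name.

```python
def precision_recall(found, relevant):
    """Return (precision, recall) for one query.

    found    - ids of the documents returned by the search engine
    relevant - ids of the documents that are actually relevant to the query
    """
    found, relevant = set(found), set(relevant)
    found_relevant = found & relevant
    precision = len(found_relevant) / len(found) if found else 0.0
    recall = len(found_relevant) / len(relevant) if relevant else 0.0
    return precision, recall

# The engine returned documents 1-4, but only 2, 4 and 5 are relevant.
precision, recall = precision_recall({1, 2, 3, 4}, {2, 4, 5})
print(precision, recall)  # half of the results are relevant; two of three relevant documents were found
```

For the optimal case described above, where every found document is relevant and every relevant document is found, both values equal 1.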

Search engines play a very important role when using the Internet. There is so much information on the Internet that finding the right piece has already become a separate task that takes a lot of time. Search engines return thousands of links per query instead of the few pages that actually contain the necessary information. Users of the World Wide Web, realizing the benefits provided by the ability to analyze spatial data, need a tool that allows them to quickly and easily search and access digital images of the terrain and other spatial information concentrated in many government, commercial and academic organizations.

1. The history of the development of search engines

One of the first ways to organize access to the network's information resources was the creation of site directories, in which links to resources were grouped by subject. The first such project was the Yahoo website, which opened in April 1994. After the number of sites in the Yahoo directory increased significantly, the ability to search for information in the directory was added. This, of course, was not a search engine in the full sense, since the scope of the search was limited to the resources present in the directory rather than all resources on the Internet.

Link directories were widely used in the past, but have practically lost their popularity at the present time. The reason for this is very simple - even today's catalogs, containing a huge amount of resources, provide information about only a very small part of the Internet. The network's largest directory DMOZ (or Open Directory Project) contains information on 5 million resources, while the Google search engine database consists of more than 8 billion documents.

The first full-fledged search engine was the WebCrawler project, which appeared in 1994.

In 1995, the search engines Lycos and AltaVista appeared. The latter was for many years the leader in the field of information search on the Internet.

In 1997, Sergey Brin and Larry Page created Google, the most popular search engine in the world today.

In September 1997, the Yandex search engine, the most popular in the Russian-speaking part of the Internet, was officially announced.

Currently, there are 3 major international search engines - Google, Yahoo and MSN Search - which have their own databases and search algorithms. Most other search engines (of which there are many) use the results of these 3 in one form or another. For example, AOL search (search.aol.com) and Mail.ru use the Google database, while AltaVista, Lycos and AllTheWeb use the Yahoo database.

In Russia, the main search engine is Yandex, followed by Rambler, Google.ru, Aport, Mail.ru and KM.ru.

The AltaVista search engine. The name "AltaVista" literally translates as "view from above".

Initially, the AltaVista search engine was a true innovator in search technology. AltaVista was created in 1995 as a project of the Digital Equipment Corporation (DEC) research laboratory. Once launched, it quickly gained recognition from users and became the leader among its peers. The main merit of the AltaVista system is its support for many languages, including Chinese, Japanese and Korean. Indeed, in 1997 no other search engine on the Web worked with several languages, especially rare ones.

In 1998 Compaq Computer Corporation bought DEC (along with AltaVista), and in early 1999 AltaVista received the status of an independent division. That same year, Microsoft licensed the AltaVista search engine for use on its MSN site. Many people immediately began to take advantage of its indexing of large amounts of information and its ability to search huge databases instantly. At the same time, the address of the search engine remained the same: altavista.digital.com.

Typing altavista.com in the address bar, however, led to the site of AltaVista Technology. The popularity of the search engine thus produced a huge influx of visitors to the AltaVista Technology site and a loss of potential search engine users. As a result, Compaq purchased the altavista.com domain for $3.35 million in August 1998 (the largest deal of its kind at that time). Despite this, Compaq never managed to profit from the search engine. Therefore, in June 1999, negotiations began between Compaq and CMGI Corporation on forming a strategic network alliance, under which AltaVista was sold to CMGI. On August 19, 1999, CMGI Corporation announced that it had acquired an 83% stake in AltaVista from Compaq.

In February 2003, AltaVista was acquired by Overture Services, Inc., which was acquired by Yahoo in July 2003. Since May 2011 AltaVista has switched to Yahoo search technology.

The AltaVista search engine, for its part, sought to become a one-stop portal that included an online store, radio station, forums, chat rooms, personal photo albums, and more. However, because of the huge cash injections this required, competition with other giant portals and published criticism from those same competitors, the company spent 2001 under the motto of abandoning its claims to portal status and "returning to the roots".

The company turned its activities in a different direction. Now www.altavista.com promotes its search engine to individual Internet users and licenses search technology to businesses, including for use on intranets. The main source of funding for the consumer version of the AltaVista search engine is advertising revenue, including revenue from its most popular forms. For example, the actual search results are now placed after links for which AltaVista is paid by the owners of the respective resources.

Along with its efforts to become a portal, AltaVista continued to improve its search technology.

Another source of income for AltaVista is the development of internal corporate search engines.

Despite obviously lagging behind its competitors, www.altavista.com is absolutely confident in its abilities. We hope that the AltaVista company will fulfill all its plans and successfully return to its roots. The AltaVista search engine (www.altavista.com) won the hearts of Internet users at an early stage of its existence. Its history is a classic example of good technology combined with vague positioning.

2. How search engines work

Search and structuring tools, sometimes referred to as search engines, are used to help people find the information they need. Search tools such as agents, spiders, crawlers and robots are used to collect information about documents located on the Internet. These are special programs that search for pages on the Web, extract the hypertext links on those pages, and automatically index the information they find to build a database. Each search engine has its own set of rules that determine how documents are found and processed. Some follow every link on every page they find, then in turn examine every link on each of the new pages, and so on. Some ignore links that lead to graphics, sound and animation files; others ignore references to resources such as WAIS databases; still others are instructed to look at the most popular pages first.

Agents are the most "intelligent" of search tools. They can do more than just search: they can even carry out transactions on your behalf. Already, they can search for specific sites and return lists of sites sorted by their traffic. Agents can process the content of documents, find and index other types of resources, not just pages. They can also be programmed to extract information from pre-existing databases. Whatever information the agents index, they pass it back to the search engine database.

The general search for information on the Web is carried out by programs known as spiders. The spiders report the content of the found document, index it, and extract the resulting information. They also look at the titles, some of the links, and send the indexed information to the search engine's database.

Crawlers look at headers and only return the first link.

Robots can be programmed to follow links to different nesting depths, perform indexing, and even check the links in a document. By their nature they can get stuck in cycles, and following links consumes significant Web resources. There are, however, methods designed to keep robots away from sites whose owners do not want them to be indexed.
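
A minimal sketch of such a robot is given below. It assumes a small in-memory link graph instead of real network requests (the URLs in `LINKS`, the agent name `demo-bot` and the function `crawl` are hypothetical). The set of visited pages keeps the robot from looping, the depth limit bounds how far it follows links, and the exclusion rules are checked with the standard-library `urllib.robotparser` module, which reads rules in the robots.txt format.

```python
from collections import deque
from urllib.robotparser import RobotFileParser

# Hypothetical site: page URL -> links found on that page.
LINKS = {
    "http://example.com/": ["http://example.com/a", "http://example.com/private/x"],
    "http://example.com/a": ["http://example.com/b", "http://example.com/"],  # links back
    "http://example.com/b": ["http://example.com/a"],                         # cycle
    "http://example.com/private/x": [],
}

# Exclusion rules as a site owner might publish them in robots.txt.
rules = RobotFileParser()
rules.parse(["User-agent: *", "Disallow: /private/"])


def crawl(start, max_depth=2, agent="demo-bot"):
    """Breadth-first traversal that never visits the same page twice."""
    visited = set()
    order = []
    queue = deque([(start, 0)])
    while queue:
        url, depth = queue.popleft()
        if url in visited or depth > max_depth:
            continue  # already seen, or nested too deep
        if not rules.can_fetch(agent, url):
            continue  # the owner asked robots to stay out
        visited.add(url)
        order.append(url)
        for link in LINKS.get(url, []):
            queue.append((link, depth + 1))
    return order


pages = crawl("http://example.com/")
```

Although pages `a` and `b` link to each other, each is visited once, and the page under `/private/` is skipped entirely.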

Agents fetch and index different kinds of information. Some, for example, index every single word in each document they encounter, while others index only the 100 most important words in each document, along with its size and word count, title, headings and subheadings, and so on. The type of index built determines what kind of searches can be made by the search engine and how the resulting information will be interpreted.
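
As an illustration of the first kind of index, the sketch below builds a simple inverted index that records every word of every document together with its positions. It is a toy example under stated assumptions, not the structure of any particular engine; the sample documents and the names `tokenize` and `build_index` are hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words, ignoring punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to {doc_id: [positions of the word in that document]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            index[word].setdefault(doc_id, []).append(pos)
    return index


docs = {
    "doc1": "Search engines index the Web",
    "doc2": "A robot follows links on the Web",
}
index = build_index(docs)
web_postings = index["web"]  # documents containing "web", with word positions
```

Because positions are stored, such an index can support phrase and proximity searches as well as plain keyword lookup; an index that keeps only titles or the top 100 words would be smaller but could not answer those queries.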

Agents can also surf the Internet and find information and then put it into a search engine database. Search engine administrators can determine which sites or types of sites agents should visit and index. The indexed information is sent to the search engine database in the same way as described above.

People can put information directly into the index by filling out a special form for the section in which they would like to put their information. This data is passed to the database.

When someone wants to find information available on the Internet, he visits a search engine page and fills out a form detailing the information he needs. Keywords, dates, and other criteria can be used here. The criteria in the search form must match the criteria used by agents when indexing the information they find while navigating the Web.

The database looks up the subject of the request based on the information provided in the completed form and outputs the corresponding documents prepared by the database. To determine the order in which the list of documents will be shown, the database uses a ranking algorithm. Ideally, the documents most relevant to the user's query will be placed first in the list. Different search engines use different ranking algorithms, but the basic principles for determining relevance are as follows:

The number of query words in the text content of the document (i.e. in the html code).

Tags in which these words are located.

The location of the search words in the document.

The proportion of query words in the total number of words in the document.

These principles apply to all search engines. The ones below are used only by some, though quite well-known, search engines (such as AltaVista and HotBot).

Time - how long the page has been in the search engine database. At first glance, this seems a rather nonsensical principle. But think how many sites on the Internet live for a month at most! If a site has been around for a long time, its owner is probably experienced in the topic, and the user is better served by a site that has been telling the world about table manners for a couple of years than by one that appeared a week ago with the same topic.

Citation index - how many links to this page lead from other pages registered in the search engine database.
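
The principles listed above can be combined into a toy scoring function. The sketch below is only an illustration: real ranking algorithms are not public, the weights are arbitrary assumptions, and the name `relevance` is hypothetical. It uses the number of query words in the text, their position, their share of all words, and the citation index (the principles concerning tags and time are left out for brevity).

```python
def relevance(query_words, doc_words, inbound_links=0):
    """Toy relevance score; the weights are arbitrary, for illustration only."""
    if not doc_words:
        return 0.0
    positions = [i for i, w in enumerate(doc_words) if w in query_words]
    if not positions:
        return 0.0
    count = len(positions)                           # query words in the text
    share = count / len(doc_words)                   # their share of all words
    earliness = 1.0 - positions[0] / len(doc_words)  # earlier first hit scores higher
    return count + 10 * share + earliness + 0.5 * inbound_links


query = {"search", "engine"}
doc_a = "search engine ranking explained".split()
doc_b = "a long page that mentions one search only".split()

score_a = relevance(query, doc_a)
score_b = relevance(query, doc_b)
score_b_cited = relevance(query, doc_b, inbound_links=4)
```

The short, focused document scores higher than the long one, and links from other pages raise the score of the second document without changing anything in its text.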

3. Comparative review of reference and search systems

3.1 Gopher

Gopher was widespread on the Internet and was the forerunner of the World Wide Web. According to some reports, until 1995 Gopher was the fastest growing technology on the Internet: the number of Gopher servers grew faster than that of any other type of server. In 1993, there were more than one and a half thousand Gopher servers in the world. In fact, it was a system for distributed search and for transmission of documents at the same time. Moreover, these features were not implemented as add-on services, as in modern search engines, but were built into the system as its basic functions.

With the help of a special program, Veronica, searches were carried out directly in the Gopher system using a special query language built on keywords. This system was in use long before the advent of modern Web search engines. Gopher (RFC 1436) is a system for searching and delivering documents stored in distributed repositories. The system was developed at the University of Minnesota (Minnesota is nicknamed the Gopher State, after the burrowing rodent). The Gopher program presents the user with a sequence of menus from which they can select a topic or article of interest. The object of a search can be a text or binary file (in many repositories even text files are stored in archived, and therefore binary, form), an image or a sound. Gopher also offers gateways to other search systems such as WWW, WAIS, Archie and Whois, and to network utilities such as telnet or FTP. Gopher can offer more convenient directory handling than FTP. To access the global network, Gopher uses a client-server model. The Gopher system is now outdated, and many of its servers have been integrated into the Web. But Gopher was the prototype of modern WWW interfaces, and that is what makes it interesting.

3.2 WAIS

WAIS is one of the most sophisticated search engines on the Internet. The only things it does not implement are fuzzy-set search and probabilistic search. Unlike many search engines, the system allows you not only to build nested Boolean queries, calculate formal relevance by various proximity measures and weight query and document terms, but also to refine the query by relevance. The system also supports term truncation, splitting documents into fields, and maintaining distributed indexes. It is no coincidence that this system was chosen as the main search engine for the implementation of the Encyclopaedia Britannica on the Internet.

The distributed information system WAIS was conceived as a network analogue of traditional information retrieval systems (IRS), allowing network users to search full-text databases using the information retrieval language traditional for IRS, in which search prescriptions are built from keywords and/or their truncations, connected by the logical operators OR or AND.
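
A search prescription of this kind can be sketched as follows. This is an illustration of the general idea only, not of the actual WAIS query syntax: the use of a trailing `*` to mark a truncation, the sample documents and the function names are all assumptions.

```python
def matches_term(term, words):
    """A term ending in '*' is a truncation: it matches any word with that prefix."""
    if term.endswith("*"):
        prefix = term[:-1]
        return any(w.startswith(prefix) for w in words)
    return term in words


def search(docs, terms, operator="AND"):
    """Return ids of documents matching all terms (AND) or at least one term (OR)."""
    combine = all if operator == "AND" else any
    return sorted(
        doc_id
        for doc_id, text in docs.items()
        if combine(matches_term(t, set(text.lower().split())) for t in terms)
    )


docs = {
    "d1": "computer networks",
    "d2": "computing history",
    "d3": "network protocols",
}

both = search(docs, ["comput*", "network*"], "AND")
either = search(docs, ["comput*", "network*"], "OR")
```

With AND only `d1` matches, since it contains words beginning with both truncations; with OR all three documents are returned.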

Initially, the WAIS system was developed by four firms: Dow Jones and Co. (business databases); Thinking Machines Corporation (information retrieval systems); Apple Computer (user interface) and KPMG Peat Marwick (work with a large number of users). The first prototype of WAIS was a semi-commercial, semi-research system with severe restrictions on use by both users and database administrators. The WAIS prototype understood natural English and translated it into the system's search prescriptions. In reality, WAIS only became widely used with the advent of the FreeWAIS version for UNIX operating systems. Today there are a large number of implementations of WAIS, mainly commercial, and the system has become a kind of standard information retrieval engine on the Internet.

When working with WAIS, users do not have to spend a lot of time to find the materials they need.

There are more than 300 WAIS libraries on the Internet. But since the information is presented mainly by volunteers from academic organizations, most of the material is in the field of research and computer science.

3.3 WWW

WWW is a system for working with hypertext. It is potentially the most powerful search tool. Hypertext connects different documents based on a predefined set of words. For example, when a new word or concept is encountered in a text, a hypertext system makes it possible to navigate to another document in which that word or concept is discussed in more detail. The WWW is often used as an interface to WAIS databases, but the lack of hypertext links there limits the WWW to simple browsing, as in Gopher.

The user, for his part, can use the hypertext capability of the WWW to link between his data and the WAIS and WWW data in such a way that the user's own records are somehow integrated into the information for public access. In fact, this, of course, does not happen, but it is perceived that way.

3.4 AltaVista

Indexing in this system is carried out using a robot. In this case, the robot has the following priorities:

key phrases at the top of the page;

key phrases by the number of occurrences/presence of words/phrases;

If there are no tags on the page, it indexes the first 30 words and shows them in place of a description (the description tag).

AltaVista's most interesting feature is its advanced search. It is worth mentioning right away that, unlike many other systems, AltaVista supports a unary NOT operator. In addition, there is the NEAR operator, which implements contextual search, where the terms must be located close to each other in the text of the document. AltaVista allows searching by key phrases and has a fairly large phraseological dictionary for this. Among other things, when searching in AltaVista you can specify the name of the field in which the word should occur: hypertext link, applet, image name, title, and a number of other fields. Unfortunately, the ranking procedure is not described in detail in the system documentation, but it can be seen that ranking is applied both in simple search and in advanced queries. In essence, this system can be classified as an extended Boolean search system.
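
The idea behind a proximity operator such as NEAR can be sketched as a check on word positions. This is a simplified illustration, not AltaVista's implementation; the function name `near` and the window sizes are assumptions.

```python
def near(words, a, b, window=10):
    """True if words a and b occur within `window` positions of each other."""
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)


text = "the search engine uses a robot to index pages".split()

close_enough = near(text, "search", "robot", window=4)  # positions 1 and 5
too_far = near(text, "search", "robot", window=3)
```

A plain AND query would accept this document in both cases; the proximity check accepts it only when the allowed distance covers the four positions between the two words.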

3.5 Opentext

The OpenText information system is the most commercialized information product online. All its descriptions read more like an advertisement than an informative user guide. The system allows you to search using logical connectors, but the query size is limited to three terms or phrases. This applies to advanced search. When results are returned, the degree to which each document matches the query and the size of the document are reported. The system also allows you to refine search results in the style of a traditional Boolean search. OpenText could be classified as a traditional information retrieval system if it were not for the ranking mechanism.

3.6 Infoseek

In this system, the index is created by a robot, but it does not index the entire site, but only the specified page. In this case, the robot has the following priorities:

words in the title have the highest priority;</p>
<p>words in the keywords tag and the description tag, and the frequency of occurrences/repetitions in the text itself;</p>
<p>when the same word is repeated side by side, the repetitions are dropped from the index;</p>
<p>up to 1024 characters are allowed for the keywords tag and 200 characters for the description tag;</p>
<p>if no tags were used, it indexes the first 200 words on the page and uses them as the description.</p>
<p>The Infoseek system has a fairly advanced information retrieval language that allows you not only to indicate which terms should be found in documents, but also to weight them in a particular way. This is done with the special signs "+" (the term must be in the document) and "-" (the term must be absent from the document). In addition, Infoseek allows you to conduct what is called a context search. This means that, using a special query form, you can require consecutive co-occurrence of words. You can also specify that certain words should occur together not just in one document, but within a single paragraph or heading. It is possible to specify key phrases that are treated as a single whole, down to the order of words. Ranking of the results is based on the number of query terms in the document and the number of query phrases minus common words. All of these factors are used as nested procedures. In a nutshell, Infoseek is a traditional system with an element of term weighting in the search.</p>
<p><b><i>4. Search robots</i> </b> <br></p>
<p>In recent years, the World Wide Web has become so popular that the Internet is now one of the main means of publishing information. 
As the size of the Web grew from a few servers and a small number of documents to enormous proportions, it became clear that manually navigating much of the hypertext link structure was no longer possible, let alone an effective method of exploring resources.</p>
<p>This problem prompted Internet researchers to experiment with automated Web navigation by programs called "robots." A web robot is a program that navigates the hypertext structure of the Web: it requests a document and then recursively retrieves all the documents to which that document refers. These programs are also sometimes called "spiders", "wanderers", or "worms". These names are perhaps more attractive, but they can be misleading: the terms "spider" and "wanderer" create the false impression that the robot itself moves, and the term "worm" could imply that the robot also reproduces like an Internet worm virus. In reality, robots are implemented as a simple software system that requests information from remote parts of the Internet using standard network protocols.</p>
<p><b><i>5. The most popular Russian-language reference and search engines on the Internet</i> </b> <br></p>
<p><b><i>5.1 Rambler</i> </b> <br></p>
<p>The Rambler search engine began its existence in 1996. Today it is one of the most popular in RuNet, second only to Yandex. According to SpyLog, Rambler accounts for 20-25% of all RuNet search queries.</p>
<p>The Rambler search engine takes into account the morphology of the Russian language when searching, which provides more opportunities for effective information retrieval. 
A system of so-called "dressings" has also been implemented, which makes it possible to display in the search results not only pages containing the query words, but also pages containing their synonyms. Another function of "dressings", and I think a more significant one, is serving contextual advertising not only for a specific query, but also for queries closely related to the original one, which makes it possible to reach a larger target audience.</p>
<p>Rambler is rightfully considered the first major advertising platform of the Russian Internet and stands at the origins of the classic network advertising business. <br></p>
<p><b><i>5.2 Yandex</i> </b> <br></p>
<p>To date, it has the largest database, which has a cluster structure and is hosted on several servers.</p>
<p>In 1996, the company CompTek, created with 100% American participation, officially announced the existence of Yandex at the Internetcom exhibition. It was a morphological add-on to "Altavista", distinguished by its speed and its ability to build hypotheses. The word index for unfamiliar words is organized in the same way as for dictionary words, which distinguishes Yandex from other search engines.</p>
<p>In September 1997, "Yandex" became an Internet project. The relevance of documents was calculated from the frequency characteristics of the search words, the weight of a word or expression, the proximity of the search words to each other in the text of the document, and so on. The main innovation of this search engine, which required an inevitable restructuring of the core, is ranking by links. 
Other innovations relate mainly to the reformulation of user queries by the system: "what is an item" is converted to "item is.", and if the query begins with the word "how", the system first of all tries to return a FAQ or other reference document. The new "Yandex" began to "understand" alternative vocabulary, which occurs in 5 percent of requests. Only in the latest version was the Yandex citation index used directly by the search engine.</p>
<p>Currently, Yandex has the most complete database of documents among Russian search engines, as well as the most recognizable brand. <br></p>
<p><b><i>5.3 Aport</i> </b> <br></p>
<p>The Aport search engine was first demonstrated in February 1996 at the Agama press conference on the opening of the Russian Club. At that time it searched only the site russia.agama.com. The system was created by Agama, a developer of software for Windows platforms, whose main product was the spelling corrector "Propis". Agama's linguistic developments were used to create a search engine in which, unlike "Rambler", say, the morphology of words was taken into account from the start and the spelling of the query was checked at the client's request.</p>
<p>The most important properties of the first version of "Aport" were the translation of queries and search results into English and back, as well as the reconstruction of all indexed pages from its own database (which means the ability to view pages that no longer exist in the original).</p>
<p>"Aport 2000" became the first Russian search engine built on presenting results by individual site. 
To divide resources into sites, "Aport" uses information from the AtRus catalog or information entered into "Aport" by the owners of the resources. At worst, it has to rely on an algorithm that identifies individual sites by certain formal criteria.</p>
<p>Users of "Aport" (unlike the regulars of "Yandex") make little use of advanced search (for every 8000 loads of the simple search page, there are 300 visits to the "Advanced search" page).</p>
<p><b><i>6. The most popular foreign search engines for a Russian-speaking user</i> </b> <br></p>
<p><b><i>6.1 Google</i> </b> <br></p>
<p>The name of the search engine Google was formed as a play on the word "googol". With it the company wants to emphasize its intention to index and process large amounts of information.</p>
<p>You can search Google in 10 different languages. You can also switch the interface to the language you need. For example, if you are looking for a German site, you can enter a query in German, and all auxiliary interface labels will be in German.</p>
<p>A very handy feature is the "cache". With this feature, the user can view an indexed page even if the page has been deleted or the server hosting it is unavailable. You can also use this feature to research your competitors, and it helps to better understand how a page is indexed by a search spider (robot).</p>
<p>With Google you can find pages that are not contained in its database. This is possible because the search spider indexes the text of links found on the pages it has indexed. <br></p>
<p><b><i>6.2 Yahoo</i> </b> <br></p>
<p>Surprisingly, this incredibly popular system, which serves millions of requests daily, began as a simple collection of bookmarks maintained by just 2 people - David Filo and Jerry Yang. 
Today, Yahoo is no longer just a directory, it is a whole group of various services, including such as the Yahooligans directory - Yahoo for children, the My Yahoo personal channel system, the free E-mail service, the "Shop with Yahoo" system (buy with Yahoo ), a joint project with MTV MTV unfURLed and much more. Among all the systems reviewed, Yahoo is the only purely directory system; Yahoo does not have its own search engine. But the list of categories on Yahoo is the most complete and simple - unlike other directories, on Yahoo it is always easy to determine in which section the necessary information is located. The Yahoo home page loads very quickly - although it has a lot of links, they are all text. The central part of the page is, of course, occupied by a search box and a list of categories. Links at the top of the page (graphics) provide access to information such as "what's new", "what's good", "More Yahoos". The last link is recommended to visit - it leads to a page with a huge number of links to various Yahoo directories and services. When setting search criteria for Yahoo, keep in mind that Yahoo only looks for these words in the title and description of the page, since there is no full-text index on Yahoo. Therefore, you should not specify too many terms or synonyms when searching - the number of results from Yahoo will decrease or even be zero. The number of search results on Yahoo is naturally small, but most of them are relevant. For advanced search Yahoo offers a small but very useful set of tools. To get to the advanced search page, you need to follow the "options" link from the main Yahoo page.</p><p><b><i>7. 
Search engine market in China</i> </b> <br></p>
<p><b><i>7.1 Baidu search engine</i> </b> <br></p>
<p>Baidu was founded in 2000, much later than the world leaders in web search. Nevertheless, it literally broke into the top ten most visited sites in the world, helped by the rapid growth of the audience of Internet users in China (360 million as of January 2010!).</p>
<p>The Baidu.com site is known to all Internet users in China: it is not only the most popular Chinese search engine, but also the most visited site in China (according to statistics from Alexa, the Web Information Company, at the beginning of March 2010 Baidu was the 8th most visited site in the world). The Baidu index contains about 800 million web pages (including more than 100 million in Chinese), about 100 million images and over 15 million media files.</p>
<p>According to the ComCore agency, Baidu processes over 10 billion search queries every month (for comparison: Yandex processes about 3 billion queries per month).</p>
<p>According to the Shanghai agency iResearch, Baidu controls 63% of the Chinese Internet search market (Google is in 2nd place with 33%).</p>
<p>In addition to its main purpose - search - Baidu provides users with the following services:</p>
<p>Baidupedia - a free and "correct" encyclopedia;</p>
<p>Baidu Posts - numerous forums on various topics;</p>
<p>Baidu Space - blog and photo album;</p>
<p>Baidu Money - payment system;</p>
<p>Baidu Download - its own file-sharing system;</p>
<p>The search given by boolean operators is literal - the machine searches for words or phrases exactly as they are entered. This can cause problems when the entered words are ambiguous. 
For example, the English word "bed" can mean a bed, a flower bed, a place where fish spawn, and much more. If the user is interested in only one of these meanings, he does not need pages that use the word in its other meanings. It is possible to build a literal search query aimed at cutting off unwanted meanings, but it would be nice if the search engine itself could provide appropriate assistance.</p>
<p>One variant of search engine operation is conceptual search. Part of this search involves statistical analysis of pages containing the words or phrases entered by the user in order to find other pages that might be of interest to that user. It is clear that conceptual search needs to store more information about each page, and each search query requires more calculations. Many development teams are currently working on improving the quality and performance of these types of search engines. Other researchers have focused on a different area, called natural-language queries.</p>
<p>The idea behind natural-language queries is for the user to formulate the query in the same way as they would ask a person sitting next to them, without having to keep track of boolean operators or complex query structures. The most popular natural-language search site today is AskJeeves.com, which analyzes the query to identify keywords that are then used to search the site index built by this search engine. This site only handles simple searches, but developers are competing fiercely to build a natural-language search engine capable of handling very complex queries.</p>
<br><p>30. Semantic systems: definition, purpose, technical essence, classification, characteristics, architecture, examples and development prospects. 
Basic Principles of Semantic Web Optimization</p>
<br><br><p><b>Semantic network (system)</b> is an information model of a subject area in the form of a directed graph whose vertices correspond to the objects of the subject area and whose arcs (edges) define the relationships between them. The objects can be concepts, events, properties or processes. A semantic network is thus one of the ways of representing knowledge. The name combines terms from two sciences: semantics, which in linguistics studies the meaning of language units, and network, which in mathematics is a kind of graph, that is, a set of vertices connected by arcs (edges). In a semantic network the nodes are the concepts of the knowledge base, and the directed arcs define the relationships between them. The semantic network therefore reflects the semantics of the subject area in the form of concepts and relationships.</p>
<p>Mathematics makes it possible to describe most phenomena in the surrounding world as logical statements. Semantic networks originated as an attempt to visualize mathematical formulas. The main representation of a semantic network is a <b>graph</b>. However, behind the graphical image there is always a strict mathematical notation, and the two forms complement rather than compete with each other.</p>
<br><br><p>The main form of representation of a semantic network is a graph. The concepts of the network are written in ovals or rectangles and connected by labelled arrows, the arcs (see Fig.). This is the form that a person perceives most easily. Its shortcomings show up when we start to build more complex networks or try to take into account the features of natural language. 
Diagrams of semantic networks that show the directions of navigational relations are called knowledge maps, and a collection of them covering large sections of a semantic network is called a knowledge atlas.</p>
<p>In mathematics, a graph is represented by a set of vertices V and a set of relations E between them. In terms of mathematical logic, each vertex corresponds to an element of the subject set, and each arc corresponds to a predicate.</p>
<p><b>An example of a semantic network (system)</b></p>
<p>In linguistics, relationships are recorded in dictionaries and thesauri. In dictionaries, definitions by genus and specific difference give the generic concept a definite place. In thesauri, the entry for each term can list all of its possible connections with related terms. Such thesauri should be distinguished from information-retrieval thesauri, whose entries contain lists of keywords intended for descriptor-based search engines.</p>
<p>Classification of semantic networks</p>
<p>All semantic networks can be classified by arity and by the number of relationship types.</p>
<p>By the number of relationship types, networks can be <b>homogeneous</b> or <b>heterogeneous</b>.</p>
<p>o Homogeneous networks have only one type of relationship (arrow), for example a classification of biological species (with a single AKO, "a kind of", relationship).</p>
<p>o Heterogeneous networks have more than one type of relationship. Classical illustrations of this model of knowledge representation show exactly such networks. Heterogeneous networks are of more practical interest but are also harder to study. They can be represented as an interweaving of tree-like multilayer structures. 
An example of such a network is the semantic network of Wikipedia.</p>
<p>By arity:</p>
<p>o Networks with <b>binary</b> relations (connecting exactly two concepts). Binary relations are very simple and are conveniently drawn on a graph as an arrow between two concepts. In addition, they play an exceptional role in mathematics.</p>
<p>o In practice, however, relations that link more than two objects, <b>N-ary</b> relations, may be needed. The difficulty then is how to draw such a connection on a graph without causing confusion. Conceptual graphs (see below) remove this difficulty by representing each relation as a separate node.</p>
<p>By size:</p>
<p>o Networks for solving specific problems, for example those addressed by artificial intelligence systems.</p>
<p>o Industry-scale semantic networks, which should serve as a basis for building specific systems without claiming to be universal.</p>
<p>o The global semantic network. In theory such a network should exist, since everything in the world is interconnected. Perhaps someday the World Wide Web will become such a network.</p>
<p>Using semantic networks</p>
<p><b>Semantization</b> is the process of modifying texts so that semantic relations are made explicit without changing their content. Wikipedia has projects to semantize articles and the Category Tree.</p>
<p>§ Articles are semantized mainly through the use of templates, with some categories created automatically.</p>
<p>§ The Category Tree can be semantized in parts, after analyzing it and selecting the sections with generic categories.</p>
<p>Semantic Web</p>
<p>The concept of hypertext organization resembles a <i>homogeneous binary</i> semantic network, but there are significant differences:</p>
<p>1. A connection made by a hyperlink has no semantics, i.e. it does not describe the meaning of the connection. 
The purpose of a semantic network, by contrast, is to describe the <i>interconnections</i> between objects rather than to provide additional information about the subject area. A person can work out why a particular hyperlink is there, but its meaning is not clear to a computer.</p>
<p>2. Hyperlinked pages are <i>documents</i> that, as a rule, describe a problem situation as a whole. In a semantic network, the vertices (the things that relationships connect) are <i>concepts</i> or <i>real-world objects</i>.</p>
<p>An attempt to create a semantic network based on the World Wide Web is called the <b>Semantic Web</b>. This concept relies on RDF (Resource Description Framework, a data model that is often serialized as XML) and is intended to give links a meaning that computer systems can understand. This would turn the Internet into a distributed knowledge base on a global scale.</p>
</div> </div> </article> </div> </div> </div> </div> </div> </body> </html>