AltaVista Search Intranet is a fast, convenient tool for searching the web pages and documents on your organization's intranet. This document explains the basic search syntax as well as keywords and other features that you can use to get the most out of your search.
The simple and advanced search interfaces are equally powerful and flexible, and neither is much more difficult to use than the other.
The main advantages of the simple search interface are
For example, suppose you want to find a recipe for muffins that includes either apples or pears, but ideally would contain both fruits. You could enter the series of words apple pear muffin recipe. If any document contains all four words, automatic ranking places that document at the top of your results list. Documents containing only some of the words would be next, and documents containing only one of the words would be ranked last.
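The ordering described in this example can be sketched in a few lines of Python. This is only an illustration: it assumes that the score is simply the number of distinct query words a document contains, whereas the actual ranking uses additional criteria. The function name and the document names are hypothetical.

```python
def rank_by_matched_words(query, documents):
    """Order documents by how many distinct query words they contain."""
    query_words = set(query.lower().split())
    scored = []
    for name, text in documents.items():
        matched = len(query_words & set(text.lower().split()))
        if matched:  # documents with no query words are not returned
            scored.append((matched, name))
    # Most matches first; ties are broken by name to keep the output stable.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored]


docs = {
    "a.html": "apple pear muffin recipe",
    "b.html": "muffin recipe with blueberries",
    "c.html": "pear tart",
    "d.html": "gardening tips",
}
ranked = rank_by_matched_words("apple pear muffin recipe", docs)
print(ranked)
```

The document containing all four words comes first, and the document containing only one of the words comes last.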
The advanced search interface requires a more precise, logical syntax which, although it is more exacting, also gives you more control over the results of your search. Using the apple pear muffin recipe example, suppose you decide that you do not want to see any documents unless they contain at least the words muffin and recipe. In advanced search syntax, a more precise rendition of the simple query would be (apple OR pear) AND muffin AND recipe.
You can optionally enter your own ranking rules in the advanced search interface. If you do not enter any ranking rules, AltaVista returns the results in no particular order.
Although the two interfaces offer basically the same features, advanced search does offer some capabilities that are not available with the simple search:
For additional information on using the advanced search interface, see Doing an Advanced Search.
Both the simple and advanced search functions use the same syntax rules regarding phrasing, case sensitivity, and finding related words.
For example, AltaVista Search interprets and indexes HAL5000, 60258, www, http, and EasierSaidThanDone all as single words, because they are continuous strings of characters, surrounded by characters that are neither letters nor digits. The software indexes all words that it finds in a web document, regardless of whether the word exists in a dictionary or is spelled correctly.
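The word rule above can be approximated with a regular expression. The following Python sketch is not the product's indexing code; it only assumes that a word is a maximal run of letters and digits and that every other character (including the underscore) acts as a separator.

```python
import re


def split_into_words(text):
    """Return the runs of letters and digits found in text."""
    # [^\W_] matches any letter or digit; everything else separates words.
    return re.findall(r"[^\W_]+", text)


words = split_into_words("HAL5000, 60258, http://www, EasierSaidThanDone!")
print(words)
```

Under the same assumption, a string such as CD-ROM splits into the two words CD and ROM.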
You can use AltaVista Search to find phrases, or groups of related words that appear next to each other. To indicate a phrase in a search query, enclose the words in double quotes. Phrasing ensures that AltaVista Search finds the words together, instead of looking for separate instances of each word individually. For example, to look for the phrase personnel policies, type "personnel policies".
If you did not use the double quotes, AltaVista Search would find instances of "personnel" alone and "policies" alone, as well as any instances where the two words happen to appear together. Enclosing the words in quotes indicates that you want to find only instances of both words together.
AltaVista Search ignores punctuation except to interpret it as a separator for words. Placing punctuation or special characters between each word, with no spaces between the characters and the words, is also a way to indicate a phrase. As an example of when punctuation might be useful in indicating a phrase, consider searching for a telephone number. Entering 1-800-555-1212 is easier than entering "1 800 555 1212", which is equally acceptable syntax but less natural. Hyphenated words, such as CD-ROM, also automatically form a phrase because of the hyphen.
Normally, however, using double quotes to indicate a phrase is recommended over the use of punctuation between words, because some special characters have additional meaning:
Case sensitivity of a search is based on the case in which you enter your query.
For example, if you enter turkey in the query field, AltaVista Search will find all occurrences of the word turkey, including those spelled TUrkey, TURKEY, turkey, and so forth.
For example, if you enter Turkey in the query field, AltaVista Search will find all occurrences of Turkey with initial capitalization only. It will not return documents containing the words TURKEY or turkey.
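The two examples above suggest a simple matching rule, sketched here in Python. The rule itself is inferred from the examples (an all-lowercase query matches any capitalization, and a query containing capital letters must match exactly); the function name is hypothetical.

```python
def word_matches(query_word, indexed_word):
    """Apply the case rule suggested by the turkey/Turkey examples."""
    if query_word == query_word.lower():
        # All-lowercase query: ignore the case of the indexed word.
        return query_word == indexed_word.lower()
    # Query contains capitals: require an exact, case-sensitive match.
    return query_word == indexed_word


for candidate in ["TUrkey", "TURKEY", "turkey", "Turkey"]:
    print(candidate, word_matches("turkey", candidate), word_matches("Turkey", candidate))
```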
AltaVista Search supports exact-match searches for characters in the ISO Latin-1 character set. That is, you can enter a word containing an accent or other diacritical mark, and AltaVista Search will find only documents with the accented spelling of the word.
For example, if you search for the French word éléphant, AltaVista Search will find only documents containing an exact match for the French spelling of the word.
Entering a word with mixed case and an accent (for example, Éléphant) would produce only results that match the word in terms of both case and accent.
If you omit accents and other diacritical marks from a search query, AltaVista Search finds documents containing words both with and without the special marks. Although this feature might produce some irrelevant results for users doing an English language search, it enables users to enter queries for non-English words even when they do not have international support on their keyboards.
To support searching for special characters without their diacritical marks, AltaVista Search maps each special character to the closest possible plain character or combination of characters. The software then indexes words in both forms: with special characters as they appear, and also with special characters replaced by the mappings. The following table illustrates the special characters and their mappings:
|Special Characters||Mapping||Special Characters||Mapping|
|Á Â À Å Ã Ä||A||á â à å ã ä||a|
|É Ê È Ë||E||é ê è ë||e|
|Í Î Ì Ï||I||í î ì ï||i|
|Ó Ô Ò Ø Õ Ö||O||ó ô ò ø õ ö||o|
|Ú Û Ù Ü||U||ú û ù ü||u|
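The following Python sketch shows how a word could be stored in both forms. It covers only the characters listed in the table above, and the names MAPPINGS, FOLD, and index_forms are illustrative rather than part of the product.

```python
# Plain character -> the special characters that map to it (from the table).
MAPPINGS = {
    "A": "ÁÂÀÅÃÄ", "a": "áâàåãä",
    "E": "ÉÊÈË", "e": "éêèë",
    "I": "ÍÎÌÏ", "i": "íîìï",
    "O": "ÓÔÒØÕÖ", "o": "óôòøõö",
    "U": "ÚÛÙÜ", "u": "úûùü",
}

# Translation table from each special character to its plain mapping.
FOLD = str.maketrans(
    {special: plain for plain, group in MAPPINGS.items() for special in group}
)


def index_forms(word):
    """Return the forms under which a word would be indexed."""
    return {word, word.translate(FOLD)}


print(index_forms("éléphant"))
print(index_forms("muffin"))
```

A query for elephant can then match a document that contains éléphant, while a query for éléphant matches only the accented spelling.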
You can use the asterisk wildcard notation ( * ) to search for a group of words that contain the same pattern. This is convenient for finding derivatives and spelling variants of the same word.
For example, to look for the word sing and any derivatives, such as singer, singers, and singing, enter sing* in the query field. Searching for cantalo* will produce matches for cantaloup, cantaloupe, cantalope, and their plurals.
Ignored inte*: 4292323
The example message indicates that there are more than four million instances in the index of words starting with "inte". Consequently, AltaVista Search does not return any results, because the query is not specific enough to be useful.
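As a rough illustration of wildcard matching, the Python sketch below treats a trailing asterisk as "any ending". It assumes that the asterisk appears only at the end of the query word, as in the examples in this section; other placements are not handled, and the function name is hypothetical.

```python
def wildcard_matches(pattern, word):
    """Match a word against a pattern with an optional trailing asterisk."""
    if pattern.endswith("*"):
        return word.startswith(pattern[:-1])
    return word == pattern


vocabulary = ["sing", "singer", "singers", "singing", "sting", "cantaloupe"]
matches = [word for word in vocabulary if wildcard_matches("sing*", word)]
print(matches)
```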
Simple searches use general syntax rules regarding phrasing, case sensitivity, and use of the asterisk (*) as a wildcard character. In addition, two operators can help to narrow a simple search:
|This Operator||Does This|
|+||includes only documents containing all specified words or phrases in the search results|
|-||excludes documents containing the specified word or phrase from the search results|
Specify the operator in front of the word that you want to include or exclude, with no spaces between the operator and the word.
To find the documents most relevant to your needs, construct your query as precisely as you can.
Example: Querying for sandals leather footwear instead of just one of those words increases the chance of finding documents about leather sandals.
Example: bicycle "for sale" finds documents that contain both the phrase for sale and the word bicycle.
Example: quilt* finds the words quilts, quilter, quilting, and quilted.
Example: noir +film -"pinot noir" finds documents containing both noir and film but not the phrase pinot noir.
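To see how a simple query breaks down into its parts, consider the following Python sketch. It only separates the terms by their operator; it does not perform a search, and the bucket names (required, excluded, unmarked) are labels chosen for this illustration.

```python
import re

# An optional + or -, followed by either a quoted phrase or a single word.
TERM = re.compile(r'([+-]?)(?:"([^"]*)"|(\S+))')


def parse_simple_query(query):
    """Group the terms of a simple query by their leading operator."""
    parsed = {"required": [], "excluded": [], "unmarked": []}
    for sign, phrase, word in TERM.findall(query):
        bucket = {"+": "required", "-": "excluded"}.get(sign, "unmarked")
        parsed[bucket].append(phrase or word)
    return parsed


parsed = parse_simple_query('noir +film -"pinot noir"')
print(parsed)
```

Because the operator must touch the term, a hyphen inside a word such as CD-ROM is not treated as the exclude operator.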
AltaVista ranks the results of a search based on a score that includes these criteria:
If you are not happy with the documents that AltaVista ranks first as the result of a search, you might need to narrow the scope of your search.
Advanced queries use the same general syntax rules as simple queries, but they offer more options for refining a search based on operators and expressions. With the advanced query feature, you have more control over the results of your search, and you also have to be more precise in order to get the results that you want.
|Keyword||Symbol||Action|
|AND||&||Finds only documents containing all of the specified words or phrases.|
|OR|||||Finds documents containing at least one of the specified words or phrases.|
|NOT||!||Excludes documents containing the specified word or phrase.|
|NEAR||~||Finds documents containing both specified words or phrases within 10 words of each other.|
You can enter the keywords in all uppercase or all lowercase. Using uppercase is a convenient way to distinguish the keywords from words that are part of your search. Entering symbols instead of keywords is also an option, although it can make the query more cryptic and less conversational.
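The NEAR operator in the table above can be pictured as a distance check on word positions. The Python sketch below assumes that "within 10 words" means the positions of the two words differ by at most 10; the guide does not define the distance more precisely, and the function name is hypothetical.

```python
def near(words, first, second, max_distance=10):
    """Return True if first and second occur within max_distance words."""
    first_positions = [i for i, word in enumerate(words) if word == first]
    second_positions = [i for i, word in enumerate(words) if word == second]
    return any(
        abs(i - j) <= max_distance
        for i in first_positions
        for j in second_positions
    )


close = "the spreadsheet course includes hands on training".split()
far = ["spreadsheet"] + ["filler"] * 15 + ["training"]
print(near(close, "spreadsheet", "training"), near(far, "spreadsheet", "training"))
```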
The following examples illustrate how to use operators and parentheses to construct an advanced search query.
Note that the syntax vegetable NOT broccoli (without the AND) returns a syntax error. When NOT appears in a position other than the beginning of a query, use AND to connect the NOT portion with the rest of the query. (OR NOT is also valid syntax, but would probably return more results than would be useful in most cases).
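To make the logic of AND, OR, NOT, and parentheses concrete, here is a small Python evaluator that tests a query against the set of words in one document. It is a sketch under stated assumptions: NOT is assumed to bind most tightly, then AND, then OR (the guide does not state a precedence, so use parentheses when in doubt); NEAR and the symbol forms are omitted; and the document words are assumed to be lowercase.

```python
import re


def evaluate(query, doc_words):
    """Evaluate an AND/OR/NOT query against a set of lowercase words."""
    tokens = re.findall(r"\(|\)|[^\s()]+", query)
    pos = 0

    def peek():
        return tokens[pos].upper() if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def parse_or():
        value = parse_and()
        while peek() == "OR":
            take()
            right = parse_and()  # always parse the right side first
            value = value or right
        return value

    def parse_and():
        value = parse_not()
        while peek() == "AND":
            take()
            right = parse_not()
            value = value and right
        return value

    def parse_not():
        if peek() == "NOT":
            take()
            return not parse_not()
        if peek() == "(":
            take()
            value = parse_or()
            if peek() != ")":
                raise SyntaxError("missing closing parenthesis")
            take()
            return value
        if peek() in (None, ")", "AND", "OR"):
            raise SyntaxError("expected a word")
        return take().lower() in doc_words

    result = parse_or()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token: {tokens[pos]}")
    return result


print(evaluate("(apple OR pear) AND muffin AND recipe", {"pear", "muffin", "recipe"}))
print(evaluate("vegetable AND NOT broccoli", {"vegetable", "carrot"}))
```

In this sketch, vegetable NOT broccoli raises a SyntaxError because NOT follows a word without a connecting AND or OR, which mirrors the rule described above.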
Unlike with simple searches, AltaVista returns the results of an advanced query in no particular order, unless you specify ranking rules. An example of when you might not want to rank results is when you are doing a search of all web pages that contain links to your home page, and you want to display the results as a count only. For a count, only the number, and not the ordering of the results, is significant.
In most cases, though, you will want to filter the results of your search so that the most useful documents appear at the top of the list. To rank results, enter words or phrases in the Ranking field. Use spaces to separate multiple words or phrases. You can use the words that are a part of your query, or you can enter new words as an additional way to refine your search. For example, you could further narrow a search for COBOL AND programming by entering advanced and experienced in the Ranking field.
Ranking also limits the search results that you can view to the top 200 documents. Because ranking naturally gives priority to documents that best meet the search criteria, 200 documents should be a sufficient number to provide you with the most useful information. For details about the factors that influence ranking, see How the Results are Ordered.
You can confine your search to a particular time period by entering dates in the Start Date: and End Date: fields at the bottom of the advanced search screen. AltaVista Search finds matches for the specified time frame based on the time that the web page was last modified. Note that the software gets this information from the web server where the page exists; it may not always be accurate.
Enter the date in the format dd/mmm/yy where dd is the day of the month, mmm is an abbreviation for the name of the month, and yy is the last two digits of the year. Be sure to use the name of the month instead of a number; this eliminates ambiguity between date formats in different countries. For example:
If you omit the year, the search assumes the date is in the current year. If you omit both the year and the month and specify only numbers for days, the search assumes the current month and year. For example, entering a Start date of 09/jan indicates that you want documents dated no earlier than January 9 of the current year. Entering a start date of 09 indicates that you want documents dated no earlier than the ninth day of the current month in the current year.
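The defaulting rules for partial dates can be sketched as follows in Python. The function name is hypothetical, and the mapping of two-digit years to centuries (70 to 99 become 19yy, 00 to 69 become 20yy) is an assumption made only for this example; the guide does not say how the software resolves the century.

```python
from datetime import date

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def parse_search_date(text, today):
    """Parse dd, dd/mmm, or dd/mmm/yy, filling gaps from today's date."""
    parts = text.strip().lower().split("/")
    day = int(parts[0])
    # A missing month defaults to the current month.
    month = MONTHS.index(parts[1][:3]) + 1 if len(parts) > 1 else today.month
    if len(parts) > 2:
        yy = int(parts[2])
        # Assumed century pivot; not specified by the guide.
        year = 1900 + yy if yy >= 70 else 2000 + yy
    else:
        # A missing year defaults to the current year.
        year = today.year
    return date(year, month, day)


today = date(1998, 5, 20)
print(parse_search_date("09/jan", today))
print(parse_search_date("09", today))
```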
On both the simple search and advanced search screens, you can choose to display the results of a search in compact or detailed form. Compact form displays all information on a single line, thereby allowing you to view more results without scrolling. Detailed form displays four sequential lines about each topic, providing slightly more information.
Both the simple and advanced search interfaces support the use of keywords to restrict your searches to pages that meet specific criteria regarding the structure and contents of a web page. Using keywords, you can search based on a URL or portion of a URL, or based on the links, art, text, and coding that a web page contains. With keywords, you can do useful things such as
To search based on keywords, enter a query in the format keyword:search-criteria where keyword is any of a list of special items for which AltaVista can search, and search-criteria is the string or condition that you want to match.
You must enter the keyword in lowercase, followed immediately by a colon. The conventions for specifying a phrase in the search criteria are the same as for specifying a phrase in a regular query: the most convenient method is to enclose the phrase in double quotes.
Note that, in the Advanced search interface, you can enter a logical expression (containing any combination of the AND, OR, NEAR and NOT operators) as the search criteria. For example, if you want to find a web page whose title contains both the words spreadsheet and training, you could enter a query in the form
title:(spreadsheet AND training)
For additional information on advanced search operators, see Doing an Advanced Search.
The following table describes the keywords that you can use in Web page searches:
|Keyword||Function|
|anchor:text||Finds pages that contain the specified word or phrase in the text of a hyperlink.|
|applet:class||Finds pages that contain a Java applet of the specified class.|
|domain:domainname||Finds pages with the specified word or phrase in the domain name of the web server where the page exists (the rightmost portion of an Internet hostname is the domain name).|
|host:name||Finds pages with the specified word or phrase in the hostname of the web server where the page exists.|
|image:filename||Finds pages that have an image tag with the specified filename.|
|link:URLtext||Finds pages that contain at least one link to a page with the specified text in its URL.|
|text:text||Finds pages that contain the specified text in any part of the page other than an image tag, link, or URL.|
|title:text||Finds pages that contain the specified word or phrase in the title.|
|url:text||Finds pages that contain the specified word or phrase in the URL.|
The url, host, and domain keywords all serve a similar purpose in that they search for URLs based on a specific portion of the URL itself, or on the hostname or domain name where the web page exists.
The link and anchor keywords are similar in that they both look for information about jumps. The link keyword looks for text in a URL that is the target of a jump (for example, http://www.abc.org/help.html), whereas the anchor keyword looks for the actual text of a hyperlink as users would see it on a web page (for example, click here).
The text and title keywords both search the contents of a document itself. The text keyword finds any visible text (other than tags, links, and URLs) within a document, whereas the title keyword restricts the search to text that the document's author coded as part of the <title> tag. The title is what appears in the window banner of your web browser. The title keyword can be a good way to hone your search to only the most significant pages about a topic, assuming the pages were titled intelligently.
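The keyword:search-criteria format can be split apart with a few lines of Python. This sketch only recognizes the lowercase keywords listed in the table above and strips surrounding double quotes from the criteria; it does not run a search, and the function name is hypothetical.

```python
KEYWORDS = {"anchor", "applet", "domain", "host", "image",
            "link", "text", "title", "url"}


def split_keyword_query(term):
    """Return (keyword, criteria), or (None, term) for an ordinary term."""
    keyword, colon, criteria = term.partition(":")
    # The keyword must be lowercase and immediately followed by a colon.
    if colon and keyword in KEYWORDS:
        return keyword, criteria.strip('"')
    return None, term


print(split_keyword_query('title:"pet grooming"'))
print(split_keyword_query("link:http://www.abc.org/help.html"))
print(split_keyword_query("Title:spreadsheet"))
```

Only the first colon separates the keyword from the criteria, so a URL that contains its own colon is passed through intact.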
If you own a web page that is new or substantially changed, and you want to add your page to the index immediately, click the Add URL link at the bottom of the main search page.
In the box labeled URL Submission to AltaVista, type the URL of the page that you want to add to the index. Be sure that the spelling and case of the URL are correct. When you are satisfied, click the Submit button.
AltaVista Search attempts to fetch a copy of your page immediately. If it fails due to network congestion or because your server is not currently available, it does not try again and you will need to submit the URL again later. If the fetch succeeds, the software informs you of that.
Depending on the size of the index and other demands on the system running the AltaVista software, it may take a while for your page to appear in the index.
You can use META tags in your web page to
For example, suppose you have a web page that advertises a pet grooming service. AltaVista Search automatically indexes all words in the page. However, you might think of a few alternate words or phrases that describe your service but do not appear in the page. Use the META tag and specify name="keywords" to add these phrases to the index and increase the chance that users will find your page:
<META name="keywords" content="pet grooming, coat beautification, split ends">
The description META tag allows you to specify what you want in the abstract that appears as the result of a search. For your pet grooming page, you might want a short promotional phrase like the following:
<META name="description" content="We specialize in grooming poodles.">
AltaVista Search indexes all words in the description tag in addition to those in the keyword tag. So in this example, users would be able to find your page by searching for "poodles" as well as for "pet grooming," "coat beautification," or "split ends."
Instead of displaying the first several lines of the web page, the search result would show the text of the description tag:
Note that description and keyword tags can be a maximum of 1,024 characters long.
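If you want to check the META tags in a page before submitting it, a short script can help. The following Python sketch uses the standard html.parser module to read the keywords and description tags and to flag any that exceed 1,024 characters. The class and function names are hypothetical, and the script does not reproduce how AltaVista Search itself reads the tags.

```python
from html.parser import HTMLParser

MAX_LENGTH = 1024  # maximum tag length stated in this guide


class MetaReader(HTMLParser):
    """Collects the content of keywords and description META tags."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lowercase.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in ("keywords", "description"):
            self.found[name] = attributes.get("content") or ""


def read_meta(html):
    reader = MetaReader()
    reader.feed(html)
    reader.close()
    return reader.found


page = """<html><head>
<META name="keywords" content="pet grooming, coat beautification, split ends">
<META name="description" content="We specialize in grooming poodles.">
</head><body>Welcome!</body></html>"""

meta = read_meta(page)
phrases = [phrase.strip() for phrase in meta["keywords"].split(",")]
too_long = [name for name, content in meta.items() if len(content) > MAX_LENGTH]
print(phrases, meta["description"], too_long)
```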
AltaVista Search Intranet software obeys the Standard for Robot Exclusion (SRE). When the page gatherer visits a web server, it examines the contents of a file called /robots.txt to determine whether the page gatherer is allowed to access web pages on the server. If you want to prevent AltaVista Search Intranet from gathering your page, you or the web server administrator can specify AltaVista Intranet in the User-Agent field of an entry in /robots.txt. For example, the following portion of a robots.txt file prevents the AltaVista Search Intranet product from gathering pages whose URLs start with /personnel:
User-agent: AltaVista Intranet
Disallow: /personnel/
Specifying either AltaVista or Intranet alone in the User-agent field will also work. If you use both words, be sure to separate them with a space and not a tab character.
Note that web server administrators can also prevent page gathering by specifying in the User-agent field of robots.txt the complete text string, or any portion of the text, specified by the AltaVista Search Intranet administrator in the HTTP User Agent field during configuration of the index. This information may appear in the web server's access log. If it does not, you can ask the AltaVista Search Intranet administrator for it.
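The matching behavior described above can be sketched in Python. This simplified reader is not the page gatherer's actual logic: it assumes that a User-agent value applies when it is an asterisk or appears anywhere in the gatherer's name (ignoring case), and it handles only User-agent and Disallow lines. The function name is hypothetical.

```python
def is_disallowed(robots_txt, agent_name, path):
    """Return True if a Disallow rule for this agent covers the path."""
    applies = False             # does the current record apply to the agent?
    previous_was_agent = False  # consecutive User-agent lines share a record
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            match = value == "*" or (
                value != "" and value.lower() in agent_name.lower()
            )
            applies = (applies or match) if previous_was_agent else match
            previous_was_agent = True
        else:
            previous_was_agent = False
            if field == "disallow" and applies and value and path.startswith(value):
                return True
    return False


ROBOTS = "User-agent: AltaVista Intranet\nDisallow: /personnel/\n"
print(is_disallowed(ROBOTS, "AltaVista Intranet", "/personnel/policies.html"))
print(is_disallowed(ROBOTS, "AltaVista Intranet", "/products/index.html"))
```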
This section provides answers to common questions that arise with using AltaVista Search Intranet.
Try to refine your query by specifying more words or more precise words. Also, for multiple-word queries, be sure to use the appropriate syntax.
To indicate a phrase, enclose multiple words in quotes. Phrasing ensures that AltaVista Search looks for occurrences of the query words next to each other, rather than individual occurrences of each word. For example, a search for "Virgin Islands" is more useful than the same query without the quotes, which would produce documents containing the word Virgin and documents containing the word Islands, as well as documents that happen to contain both words next to each other.
What is actually happening here is that the query word occurs too frequently relative to the total number of words in the index. The number following the Ignored: message is the number of occurrences of the word in the index. If a query word occurs more frequently than a certain percentage of the total number of indexed words, AltaVista Search considers it a "noise" word and the software ignores it.
By default, a word is considered useful only if its frequency is less than 0.25 percent of the total number of words in the index. For example, in a group of 100,000 indexed words, if there are more than 250 instances of the same word, that word is considered too common and is ignored in a simple query (or as a ranking criterion in an advanced query).
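The arithmetic in this example can be checked with a short Python sketch. The function name is hypothetical, and the threshold is written with integers (25 per 10,000, which equals 0.25 percent) to avoid rounding issues.

```python
def is_noise_word(occurrences, total_words):
    """Return True if a word exceeds the default 0.25 percent threshold."""
    # occurrences / total_words > 0.25 / 100, rearranged to use integers only.
    return occurrences * 10_000 > total_words * 25


print(is_noise_word(251, 100_000))  # more than 250 instances: ignored
print(is_noise_word(250, 100_000))  # exactly 250 instances: not ignored
```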
One solution is to enter a more specific query. As a practical example, suppose you search for the word plan. In an intranet index with many documents containing many instances of the word plan, this search is likely to produce results that are too numerous to be useful anyway. A query such as abc +project +plan would be more precise and would not produce as many irrelevant results.
If you still want to get results from a query that is ignored in the simple interface, you can use the advanced search interface. In the Selection Criteria box, specify all query words. In the Results Ranking Criteria box, specify any words except those that were ignored.