Search for terms and phrases (full-text boolean searching)

* Scroll down to our basic, intermediate, and advanced search connector tables (or download them as PDFs at the bottom of this page) for more in-depth explanations of the above

Searching with connectors such as and, or, or w/5 between your search terms is called Terms and Connectors Searching.  This is a good strategy to employ when you need to get more specific about the language you're searching for.  Below are step-by-step instructions as well as some videos to get you started with full text searching.


How much do full text searches really help? - Video



In the video above, and in the written explanation of it below (the "How much do full text searches really help?" case study), you will see the difference in results between a general retrieval (with no full text search) and four increasing levels of full text search, and how valuable the full text component is.

SUMMARY:

  1. Any full text search will cut out 90% of the time wasted by methods like using Ctrl+F
  2. Proximity connectors, wildcards, and synonyms can get you more than 10x the amount of relevant results in the blink of an eye
  3. It is worth learning at least a little bit about full text searching

How much do full text searches really help? - Case Study

In the following case study, we take a basic search through 5 different stages on the simplicity-complexity scale and see the effect each stage has on our actual results.  This case study follows the exact same example covered in the video above ("How much do full text searches really help?").  We also advise that you scroll down to view the beginner, intermediate, and advanced search connector tables (or download them as PDFs at the bottom of this article) for full explanations of how connectors function in searches.


CASE STUDY SEARCH, step 1: categories but no search terms


  1. We start with a simple search for all Financial Statements from Consumer Products and Industrial Products companies
  2. In the SEDAR Filings dataset, add Industry and Document Category criteria if they are not already added by clicking on the + Add criteria link on the upper left of your screen
    1. In Industry add Consumer Products and Industrial Products
    2. In Document Category add Financial Statements
    3. In Filing Date add Last 2 years
    4. Click Search
  3. You will get 1000 results
    1. That is a lot of Financial Statements, but we don’t know whether they discuss what we’re interested in: net profit and sales growth
    2. We can click into each result and search for these terms document by document, similar to using Ctrl+F on a SEDAR filing from sedar.com
    3. This retrieval has no full text search component

 

CASE STUDY SEARCH, step 2: adding search terms and using and

  1. Question:  "Do I need to find any words or phrases in these documents?"
  2. Go back to your search screen, add the keywords net profit and sales growth to your search, and click Search again (you can copy and paste the search from here)
    1. You now have two exact phrases (words with spaces in between them are exact phrases) that must appear in the document ("and" indicates that the two phrases need to be present in the document but it does not specify any relationship between them)
    2. Note the number of results you retrieve (at the time of filming of this video it was 74 results)
      1. You will be able to click into each result and automatically see every place that either net profit or sales growth appears in each filing
      2. The result of 74 means 926 (92.6%) of the results in your first search were not on point and were essentially noise
        1. This simple search has already saved you 90% of the time you’d spend if you didn’t use any full text search, but it is still missing the majority of valid results
      3. This search uses the and connector – and – to specify more than one word or phrase that needs to be found
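
The and logic in this step can be sketched in a few lines of Python. This only illustrates the behaviour described above; it is not how the search engine itself is implemented, and the function name `matches_all_phrases` and the two sample snippets are made up for the example.

```python
def matches_all_phrases(text: str, phrases: list[str]) -> bool:
    """Return True only if every phrase appears somewhere in the text.

    Mirrors the `and` connector: each phrase must be present,
    but no relationship between the phrases is required.
    """
    lowered = text.lower()
    return all(phrase.lower() in lowered for phrase in phrases)


# Hypothetical snippets standing in for full filings.
doc_a = "Net profit rose this year. Separately, sales growth slowed in Q4."
doc_b = "Net profit rose this year, helped by lower costs."

phrases = ["net profit", "sales growth"]
print(matches_all_phrases(doc_a, phrases))  # True: both phrases are present
print(matches_all_phrases(doc_b, phrases))  # False: "sales growth" is missing

# Share of step 1 results that dropped out in step 2 (counts from the case study).
total_results, matching_results = 1000, 74
noise_share = (total_results - matching_results) / total_results
print(f"{noise_share:.1%}")  # 92.6%
```

Note that the two phrases can be anywhere in the document, which is why this search can still return filings where net profit and sales growth are discussed in unrelated sections.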

     

CASE STUDY SEARCH, step 3: adding a proximity connector

  1. Question:  "Are any of my phrases too narrow or too specific?"
  2. Question:  "Am I looking for search terms to be included in the same discussion but not necessarily to constitute an exact phrase?"
  3. Go back to your search, change the search to net profit and (sales w/5 growth), and click Search again
    1. This allows the phrase sales growth to be replaced by any phrase that includes sales and growth within 5 words of each other
    2. This is non-directional so you will get sales growth as well as growth in fourth quarter sales, growth in domestic sales, and growth in international sales, among other variations
  4. Note the number of results you retrieve (at the time of filming it was 182 results) – this is more than double the number of results retrieved in step 2
    1. This search uses the proximity connector – w/n – (where n would be replaced by any number from 1 to 5000, in order to limit the maximum distance you will accept between the word or phrase preceding it and the word or phrase following it) to ensure that terms are related or close to each other
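
To make the non-directional behaviour of w/n concrete, here is a minimal Python sketch. It assumes that distance is simply the difference between word positions; the product's actual counting rules are not documented here and may differ. The helper name `within_n_words` is hypothetical.

```python
import re


def within_n_words(text: str, first: str, second: str, n: int) -> bool:
    """Return True if `first` and `second` occur within n words of each other.

    Non-directional, like w/n: either word may come first.
    Assumption for this sketch: distance is the difference in word positions.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    first_positions = [i for i, w in enumerate(words) if w == first]
    second_positions = [i for i, w in enumerate(words) if w == second]
    return any(abs(i - j) <= n for i in first_positions for j in second_positions)


print(within_n_words("Sales growth was strong", "sales", "growth", 5))  # True
print(within_n_words("Growth in fourth quarter sales was strong", "sales", "growth", 5))  # True

# Twelve words separate the two terms here, so w/5 does not match.
far_apart = "Sales fell. " + "filler " * 10 + "growth resumed"
print(within_n_words(far_apart, "sales", "growth", 5))  # False
```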

     

CASE STUDY SEARCH, step 4: adding a wildcard

  1. Question:  "Do I need any variations on the exact forms of my search terms such as plurals or tenses?"
  2. Go back to your search, change the search to net profit* and (sales w/5 grow*), and click Search again
    1. This will allow you to get net profit as well as net profits, and also to get grow as well as grows, growing, grown, and growth, but it will NOT get grew (to get grew you need to add it as a synonym in step 5 below, since it does not begin with the same four letters as grow)
    2. When adding a wildcard to a root word, cut the word at the last character that occurs in all variations of it
      1. increase* (with the e before the asterisk) will get increase, increases, increased but it will NOT get increasing
      2. increas* (with the s before the asterisk) will get increase, increases, increased, and increasing
  3. Note the number of results you retrieve (at the time of filming it was 327 results) – this is more than 4 times the number of results retrieved in step 2
    1. You will be able to see every time the phrase net profit or net profits appears, as well as every time that sales appears within 5 words of most forms of grow (grow, growing, growth, grown), in any of these documents
    2. This search uses the wildcard – * – to allow for various endings of words
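
The wildcard rule above (cut the word at the last character shared by all variations) is easy to check for yourself. The sketch below treats a trailing asterisk as "any word that starts with these letters", which is what this step describes; the helper names are hypothetical and this is not the search engine's own code.

```python
import re


def wildcard_to_regex(term: str) -> re.Pattern:
    """Turn a trailing-asterisk term such as 'grow*' into a whole-word regex."""
    root = term.rstrip("*")
    suffix = r"\w*" if term.endswith("*") else ""
    return re.compile(rf"\b{re.escape(root)}{suffix}\b", re.IGNORECASE)


def matching_words(term: str, candidates: list[str]) -> list[str]:
    """Return the candidate words that the wildcard term would match."""
    pattern = wildcard_to_regex(term)
    return [w for w in candidates if pattern.fullmatch(w)]


grow_forms = ["grow", "grows", "growing", "grown", "growth", "grew"]
print(matching_words("grow*", grow_forms))
# ['grow', 'grows', 'growing', 'grown', 'growth']  ("grew" is not matched)

increase_forms = ["increase", "increases", "increased", "increasing"]
print(matching_words("increase*", increase_forms))  # ['increase', 'increases', 'increased']
print(matching_words("increas*", increase_forms))   # all four forms
```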

     

CASE STUDY SEARCH, step 5: adding synonyms (using or between them)

  1. Question:  "Can any of my search terms be substituted with a different word that might be used in its place?"
  2. Go back to your search, change the search to net profit* and (sales w/5 grow* or increas*), and click Search again
    1. We can see that net profit is a term of art that doesn't have synonyms, and the word sales doesn't have any synonyms we would accept in this context
    2. The same is not true for grow, which could be replaced in a sentence by increase, augment, or ramp up; even the non-synonym double would get you relevant results ("doubling sales" might be as valid to you as "growing sales")
    3. Take a moment and think of not only literal synonyms, but any other terms you would accept in the same place, even if they are not literal synonyms
  3. Note the number of results you retrieve (at the time of filming it was 860 results) – this is more than 11 times the number of results retrieved in step 2
    1. You will be able to see every time the phrase net profit or net profits appears, as well as every time that sales appears either within 5 words of most forms of grow (grow, growing, growth, grown, etc.) or else within 5 words of any form of increase (increases, increased, increasing, etc.) in any of these documents
    2. This search uses two interchangeable terms (grow* and increas*), separated by – or –, to allow for various words in the place of “grow”
    3. We are only using one synonym above (increas*), but a more complete search might be something like net profit* and (sales w/5 grow* or grew or increas* or improv* or augment*), which would get even more results
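
Here is a small Python sketch of why adding a synonym widens the net. It follows the reading given in this step (sales within 5 words of any form of grow, or within 5 words of any form of increase) and assumes distance is the difference between word positions; `near_any_prefix` is a hypothetical helper, not part of the product.

```python
import re


def near_any_prefix(text: str, anchor: str, prefixes: list[str], n: int) -> bool:
    """True if `anchor` is within n words of a word starting with any prefix."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    anchor_positions = [i for i, w in enumerate(words) if w == anchor]
    synonym_positions = [
        i for i, w in enumerate(words) if any(w.startswith(p) for p in prefixes)
    ]
    return any(abs(i - j) <= n for i in anchor_positions for j in synonym_positions)


sentence = "Domestic sales increased sharply during the quarter"
print(near_any_prefix(sentence, "sales", ["grow"], 5))             # False
print(near_any_prefix(sentence, "sales", ["grow", "increas"], 5))  # True
```

The sentence is clearly on point, yet the step 4 search (grow* only) would miss it; the extra synonym is what picks it up.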


CASE STUDY, CONCLUSIONS:

  1. Even the most basic search, the search in step 2 above, cuts out 90% of the noise you would have to wade through on SEDAR or with any method relying on Ctrl+F to find words in documents
  2. The most specific search, the search in step 5 above, uses (1) a proximity connector, (2) wildcards, and (3) synonyms (separated by an or), and finds more than 11 times as many relevant documents as the basic search, the search in step 2 above, does
  3. By sorting results by rank you will start with the most relevant results (those that have the search terms most frequently occurring and most tightly clustered together) first, so you don’t have to look at all of the 860 results but can just look at the highest/best matches among them
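
The multiples quoted in this case study follow directly from the result counts. A quick check in Python, using only the counts reported above (they were recorded at the time of filming, so your own numbers will differ if you rerun the searches):

```python
# Result counts reported in the case study (at the time of filming).
results_by_step = {1: 1000, 2: 74, 3: 182, 4: 327, 5: 860}

baseline = results_by_step[2]  # the basic `and` search from step 2
for step in (3, 4, 5):
    multiple = results_by_step[step] / baseline
    print(f"step {step}: {results_by_step[step]} results, {multiple:.1f}x step 2")

# step 3: 182 results, 2.5x step 2
# step 4: 327 results, 4.4x step 2
# step 5: 860 results, 11.6x step 2
```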
     

To search with connectors

  1. Intro Level: Make sure you are at least familiar with the first 5 connectors in the chart below ([space], AND, OR, AND NOT, *)
    a. Look at the 6th connector (w/n) and consider whether or not it would be useful for you in your searching.  If the answer is no, you will not need to know more about terms and connectors searching than the intro level
  2. Basic Level: Look at the entire list of connectors in the Basic Terms and Connectors Searching chart below
    a. If you find you have no unanswered questions and you are not interested in knowing more, you will not need more than basic level
  3. Intermediate and Advanced Levels: Look at the second chart below – Intermediate and Advanced Terms and Connectors Searching
    a. The first 3 examples are intermediate and the last 3 are advanced applications of Terms and Connectors Searching
      i. Understanding and employing intermediate and advanced terms and connectors searching gives you a lot more power over what you look at and allows you to cut out a lot of noise in your searching

Table #1: Basic Terms and Connectors Searching

(download this table as a PDF at the bottom of this page, video explanation follows table)

| Connector | Example | Retrieves | Highlights |
|---|---|---|---|
| [space] | region of incorporation | Documents that contain the exact same phrase searched for. EXCEPTION: some phrases do require quotation marks in order to be recognized. IMPORTANT: see the "" (quotes) connector below | The exact phrase region of incorporation |
| AND | warrant AND consideration | Documents that contain both terms in them | Both terms anywhere in the document, regardless of proximity to each other |
| OR | warrant OR consideration | Documents that contain either term OR both terms in them | Either term anywhere in the document |
| AND NOT | warrant AND NOT consideration | Documents that contain one term but must not contain the other | Only the term warrant (the document must not contain the term consideration) |
| * | warrant* | Documents that contain any term that begins with a specified string of characters | Any term that starts with "warrant" including warrants, warranted, warranty, warranties, etc. |
| w/n | warrant w/10 consideration | Documents that contain one term within a certain number of words of the other term. Allows for any combination of words in between these two terms so it is not looking for any exact phrase in particular. It is looking for terms that form part of an idea, conversation, or topic. | Either term whenever it appears within a certain number of words of the other term |
| pre/n | warrant pre/10 consideration | Documents that contain one term preceding the other term by a certain number of words (or less than that number of words) | Both warrant and consideration so long as warrant precedes consideration by 10 words or fewer. If warrant is 11 words before consideration, neither term will be highlighted |
| NOT w/n | warrant NOT w/10 consideration | Documents that have at least one instance of a term appearing in them without that term being within a certain distance of another specified term | Warrant whenever it is not within 10 words of consideration. Warrant may also appear within 10 words of consideration in this document but this instance will not be highlighted |
| xfirstword | warrant w/10 xfirstword | Specifies the location of the first word appearing in the document. When combined with w/n, finds documents that have a term appearing within a certain number of words of the first word in the document | Every instance of "warrant" that appears within 10 words of the first word in the document |
| "" (quotes) | "warranties and representations"; "incorporated or deemed to be incorporated"; "not limited to" | Documents that have the exact phrase that was searched for, including recognizing and, or, and not as normal terms and not as connectors | Unnecessary for most phrase searching. Only necessary when the exact phrase contains a word that is normally a connector such as "and", "or", "not". Any time and, or, and not are enclosed within "" they will be treated as regular terms to be searched for and will cease being connectors in that phrase |
| % | wa%rrant | Documents that have words that are somewhat similar to warrant | Will find misspellings of warrant such as warant and warrrant |
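
Two of the connectors in this table, pre/n and NOT w/n, are easy to confuse with plain w/n. The following Python sketch illustrates the difference as the table describes it. It is only an approximation for learning purposes: it assumes distance is the difference between word positions, and the function names are hypothetical.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())


def precedes_within(text: str, first: str, second: str, n: int) -> bool:
    """pre/n: `first` must come before `second`, at most n words earlier."""
    words = tokenize(text)
    return any(
        0 < j - i <= n
        for i, w in enumerate(words) if w == first
        for j, v in enumerate(words) if v == second
    )


def appears_not_within(text: str, term: str, other: str, n: int) -> bool:
    """NOT w/n: at least one `term` has no `other` within n words of it."""
    words = tokenize(text)
    other_positions = [j for j, v in enumerate(words) if v == other]
    return any(
        all(abs(i - j) > n for j in other_positions)
        for i, w in enumerate(words) if w == term
    )


text = "The warrant was issued for cash consideration"
print(precedes_within(text, "warrant", "consideration", 10))     # True
print(precedes_within(text, "consideration", "warrant", 10))     # False: wrong order
print(appears_not_within(text, "warrant", "consideration", 10))  # False: too close
print(appears_not_within("A warrant was issued", "warrant", "consideration", 10))  # True
```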

     


Video #1: Basic Terms and Connectors Searching




Table #2: Intermediate and Advanced Terms and Connectors Searching
(download this table as a PDF at the bottom of this page, video explanation follows table)

Level: Intermediate
Search: (warrant and consideration) or common shares
Retrieves:
  Documents that:
    1) Have both warrant and consideration
    2) But don't necessarily contain common shares
  OR ELSE documents that:
    1) Contain common shares
    2) But don't necessarily contain either warrant or consideration
Highlights:
  Any occurrences of warrant, consideration, or common shares that are found in the relationships specified in the search
  Warrant will only be highlighted if the term consideration is in the document

Level: Intermediate
Search: warrant and (consideration or common shares)
Retrieves:
  Documents that:
    1) Contain warrant
    2) And also contain EITHER consideration or common shares
Highlights:
  Any occurrences of warrant, consideration, or common shares that are found in the relationships specified in the search

Level: Intermediate
Search: (warrant and consideration) w/10 common shares
Retrieves:
  Documents that:
    1) Contain warrant within 10 words of common shares
    2) AS LONG AS the same document ALSO contains consideration within 10 words of common shares
Highlights:
  For warrant to be highlighted it must be:
    * within 10 words of common shares, and
    * consideration must ALSO be within 10 words of common shares or else warrant will not be highlighted
  For consideration to be highlighted it must be:
    * within 10 words of common shares, and
    * warrant must ALSO be within 10 words of common shares or else consideration will not be highlighted
  For common shares to be highlighted it must be:
    * within 10 words of warrant, as well as be
    * within 10 words of consideration or else common shares will not be highlighted
  Note: it does not matter how far apart warrant and consideration are, although they cannot logically be more than 20 words away from each other given that each term is limited to within 10 words of common shares

Level: Advanced
Search: common shares w/20 warrant w/10 consideration
Retrieves:
  Documents that:
    1) Contain warrant within 20 words of common shares
    2) Contain warrant within 10 words of consideration
    3) ALSO contain consideration within 10 words of common shares
  In this search, common shares, the first term typed, is an anchor term and all proximity connectors that follow in that string apply as a distance from this anchor term common shares
  There is a second stipulation that warrant needs to also be within 10 words of consideration in addition to being within 20 words of the anchor term common shares
Highlights:
  To be highlighted:
    1) Common shares must be:
      * within 20 words of warrant, as well as be
      * within 10 words of consideration
    2) Warrant must be:
      * within 20 words of the anchor term common shares, as well as be
      * within 10 words of consideration
    3) Consideration must be:
      * within 10 words of the anchor term common shares, as well as be
      * within 10 words of warrant

Level: Advanced
Search: common shares w/20 (warrant w/10 consideration)
Retrieves:
  Documents that:
    1) Contain warrant within 10 words of consideration
    2) Contain warrant OR ELSE consideration within 20 words of common shares
    3) consideration can be any distance from common shares so long as the two above conditions are met
Highlights:
  To be highlighted:
    1) Common shares must be:
      * within 20 words of warrant, OR ELSE be
      * within 20 words of consideration
    2) Warrant must be:
      * within 10 words of anchor term consideration, as well as be
      * within 20 words of common shares ONLY IF consideration is not within 20 words of common shares
    3) Consideration must be:
      * within 10 words of warrant, as well as be
      * within 20 words of the anchor term common shares ONLY IF warrant is not within 20 words of common shares
    4) Only warrant OR consideration needs to be within 20 words of the anchor term common shares

Level: Advanced
Search: common shares w/10 warrant w/15 consideration w/20 collectively
Retrieves:
  Documents that contain:
    1) Warrant within 10 words of common shares
    2) Consideration within 15 words of common shares
    3) Collectively within 20 words of common shares
  Due to the string of consecutive proximity connectors (not broken by an AND, OR, or NOT), any returned document will also need to contain:
    1) Warrant within 15 words of consideration
    2) Consideration within 20 words of collectively
  In this search, common shares, the first term typed, is an anchor term and all proximity connectors that follow in that string apply as a distance specified relative to this anchor term common shares
  There is a second stipulation that each term ALSO needs to be within a specified proximity to the term following it, based on the proximity connector used (w/10, w/15, w/20)
Highlights:
  To be highlighted:
    1) Common shares must be:
      * within 10 words of warrant, as well as be
      * within 15 words of consideration, and also be
      * within 20 words of collectively
    2) Warrant must be:
      * within 10 words of the anchor term common shares, as well as be
      * within 15 words of consideration
    3) Consideration must be:
      * within 15 words of the anchor term common shares, as well as be
      * within 15 words of warrant, and also be
      * within 20 words of collectively
    4) Collectively must be:
      * within 20 words of the anchor term common shares, as well as be
      * within 20 words of consideration
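
The anchor-term rule for an unbroken chain of proximity connectors is the hardest part of this table to picture, so here is a minimal Python sketch of it. It encodes only the reading given above: each term must be within its own w/n of the anchor term and within that same distance of the term typed just before it. The function names are hypothetical, and measuring a multi-word phrase from its first word is an assumption made for this sketch, not documented behaviour.

```python
import re
from itertools import product


def positions(words: list[str], term: str) -> list[int]:
    """Start indexes at which the (possibly multi-word) term occurs."""
    parts = term.lower().split()
    return [
        i for i in range(len(words) - len(parts) + 1)
        if words[i:i + len(parts)] == parts
    ]


def chained_proximity(text: str, anchor: str, chain: list[tuple[int, str]]) -> bool:
    """Check `anchor w/n1 t1 w/n2 t2 ...` under the anchor-term reading.

    Each term must be within its own n of the anchor AND within that same n
    of the term typed just before it.
    """
    words = re.findall(r"[a-z0-9']+", text.lower())
    terms = [anchor] + [term for _, term in chain]
    limits = [n for n, _ in chain]
    # Try every combination of occurrences until one satisfies all distances.
    for combo in product(*(positions(words, t) for t in terms)):
        if all(
            abs(combo[k + 1] - combo[0]) <= limits[k]
            and abs(combo[k + 1] - combo[k]) <= limits[k]
            for k in range(len(limits))
        ):
            return True
    return False


text = "Each warrant entitles the holder to acquire common shares for cash consideration"

# common shares w/20 warrant w/10 consideration
print(chained_proximity(text, "common shares", [(20, "warrant"), (10, "consideration")]))  # True

# common shares w/20 warrant w/5 consideration (warrant and consideration are too far apart)
print(chained_proximity(text, "common shares", [(20, "warrant"), (5, "consideration")]))   # False
```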


Video #2: Intermediate Terms and Connectors Searching



Video #3: Advanced Terms and Connectors Searching



     

