Information Storage and Retrieval, 540-671
Boolean IR model

Boolean logic
A Boolean query is a query in which the search terms (operands) are combined with Boolean operators such as NOT, OR, and AND, together with parentheses.
1 AND requires that both search terms it connects be present in the retrieved document set, e.g. A AND B, where A = computer, B = archive.
2 OR requires that at least one of the two search terms it connects be present in the retrieved document set, e.g. A OR B, where A = computer, B = medicine.
3 NOT requires that the specified search term be absent from the retrieved document set, e.g. A AND NOT B, where A = computer, B = archive.
4 Combination of logic operations
5 Order of precedence in a Boolean operation
  5.1 Role of parentheses: an expression within parentheses is executed first.
  5.2 Operator order of precedence: NOT, then AND, then OR.
  5.3 At the same level, operations proceed from left to right.
6 De Morgan's laws:
  NOT (A AND B) = (NOT A) OR (NOT B)
  NOT (A OR B) = (NOT A) AND (NOT B)
Other laws:
  A AND (C OR B) = (A AND C) OR (A AND B)
  A AND B = B AND A
  A OR B = B OR A
  NOT (NOT A) = A
Note: A, B, and C can themselves be logic expressions.

Please complete the following:
1 (A*B+C*D) =
  (A*B)*(C*D) =
  (A+B)*(C+D) = A*C+A*D+B*C+B*D

Please simplify the following:
1 A*(B+(C*D) =
2 (A*(C+D) =

Process a Boolean query
Why can't a machine understand a regular Boolean query?
1. A machine can only read a series of codes sequentially.
2. A machine does not understand the nesting levels within a Boolean query.
3. A machine does not understand logic operator precedence.

Processing logic expressions: reverse Polish expression
Stack: a stack is a list, or storage area, in which data enters and exits according to the rule "first in, last out".
Difference between an operand and an operator.

Algorithm description
Note: The original Boolean expression is processed from left to right in the original expression area. Converted items are collected in the Polish area.

Rules for operation
a. If the current item in the original expression area is an operand, it directly enters the Polish area.
b. If the current item in the original expression area is “(”, it directly enters the stack area.
c. If the current item in the expression area is “)”, the current (top) item in the stack exits and enters the Polish area. This process continues until the current item in the stack area is “(”. Then both “(” and “)” exit; neither enters any area, they are simply discarded from the storage areas.
d. If the current item in the expression area is an operator such as “-”, “+”, or “*”, compare its weight with that of the current item in the stack. When its weight is larger than that of the current item in the stack, it enters the stack. Otherwise (when its weight is smaller than or equal to that of the current item in the stack), the current item in the stack exits and enters the Polish area. Step d repeats until the operator's weight is larger than that of the current item in the stack.
e. If the current item in the expression area is the end mark “.”, all items in the stack exit the stack and enter the Polish area one by one until the stack is empty.
f. The weight of the initial empty stack is 0.

Example: [diagram]

Interpretation of the reverse Polish expression
The system reads the reverse Polish expression from left to right. When it reads an operand, it keeps reading. When it reads an operator, the system stops, looks backward, and uses the one or two neighboring operands to perform the corresponding logic operation. The result of this operation is a set of documents, just like the result for a normal keyword, and it is used in later operations like a normal operand.

Explanation of the reverse Polish method
There are two kinds of operators:
a. One-operand operators (like NOT);
b. Two-operand operators (like AND, OR).
When the system looks backward, if the current operator is a one-operand operator, it takes one neighboring operand for the logic operation. If the current operator is a two-operand operator, it takes two neighboring operands.
The system continues to process all items in the reverse Polish expression until the final result is generated.

Limitations of Boolean logic
1 The importance of keywords cannot be specified in terms of weights.
2 Retrieved documents cannot be ranked by their relevance to the query.
3 Users have to follow the strict logic syntax and understand the meaning of each operator.
4 Retrieval results are very sensitive to the NOT operator.
5 Boolean logic is not the same as a Boolean query.
6 It lacks effective feedback mechanisms to support and adjust a search.

Project
Please convert the following Boolean query into reverse Polish format, draw all diagrams for the conversion, and check your result by identifying the order of operations in a diagram.
Q = A+B*(D+E*F)*-(G+P*O)+Y*X).
Notes:
(1) + stands for logical OR;
(2) * stands for logical AND;
(3) - stands for logical NOT;
(4) Here A, B, D, E, F, G, O, P, Y, and X are keywords.
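
The AND, OR, and NOT definitions in the Boolean logic slides can be read as operations on sets of retrieved documents. The following is a minimal sketch under that reading; the four-document collection `DOCS` and the helper names `postings`, `AND`, `OR`, and `NOT` are hypothetical and chosen only for illustration.

```python
# Hypothetical toy collection: document ID -> keywords found in that document.
DOCS = {
    1: {"computer", "archive"},
    2: {"computer", "medicine"},
    3: {"archive"},
    4: {"medicine"},
}
ALL_IDS = set(DOCS)  # NOT is taken relative to the whole collection


def postings(term):
    """Return the set of document IDs that contain the given keyword."""
    return {doc_id for doc_id, words in DOCS.items() if term in words}


def AND(a, b):
    return a & b  # both terms present


def OR(a, b):
    return a | b  # at least one term present


def NOT(a):
    return ALL_IDS - a  # term absent


A = postings("computer")
B = postings("archive")

print(sorted(AND(A, B)))                     # [1]
print(sorted(OR(A, postings("medicine"))))   # [1, 2, 4]
print(sorted(AND(A, NOT(B))))                # [2]

# De Morgan's laws hold for these sets.
print(NOT(AND(A, B)) == OR(NOT(A), NOT(B)))  # True
print(NOT(OR(A, B)) == AND(NOT(A), NOT(B)))  # True
```

Treating each keyword as a set of document IDs is what lets an intermediate result be reused as an operand in a later operation.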
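
Rules a to f for building the reverse Polish expression can be sketched in a few lines. This is only an illustration under stated assumptions: the slides give no numeric operator weights, so the `WEIGHT` table below is derived from the precedence order NOT, AND, OR; a “(” on the stack is assumed to weigh 0, like the empty stack; and, as a deliberate deviation from rule d, the prefix operator “-” is always pushed without popping, so that repeated NOTs stay in the right order. The function name `to_rpn` is hypothetical.

```python
# Assumed weights, derived from the precedence NOT > AND > OR.
WEIGHT = {"+": 1, "*": 2, "-": 3}


def to_rpn(tokens):
    """Convert infix tokens (+ = OR, * = AND, - = NOT) to reverse Polish order."""
    polish = []  # the Polish area
    stack = []   # the stack area

    def top_weight():
        # Rule f: the empty stack weighs 0. Assumption: "(" also weighs 0.
        if not stack or stack[-1] == "(":
            return 0
        return WEIGHT[stack[-1]]

    for tok in tokens:
        if tok == ".":                      # rule e: end mark
            break
        if tok == "(":                      # rule b
            stack.append(tok)
        elif tok == ")":                    # rule c
            while stack and stack[-1] != "(":
                polish.append(stack.pop())
            if not stack:
                raise ValueError("unmatched ')'")
            stack.pop()                     # discard the "("
        elif tok in WEIGHT:                 # rule d
            if tok != "-":                  # deviation: prefix NOT never pops
                while top_weight() >= WEIGHT[tok]:
                    polish.append(stack.pop())
            stack.append(tok)
        else:                               # rule a: operand
            polish.append(tok)

    while stack:                            # rule e: flush the stack
        if stack[-1] == "(":
            raise ValueError("unmatched '('")
        polish.append(stack.pop())
    return polish


print(" ".join(to_rpn(list("A+B*C."))))     # A B C * +
print(" ".join(to_rpn(list("(A+B)*C."))))   # A B + C *
print(" ".join(to_rpn(list("A*-(B+C)."))))  # A B C + - *
```

The sketch assumes single-letter keywords, so `list(...)` is enough to split a query into tokens; multi-character keywords would need a real tokenizer.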
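
The interpretation step, reading the reverse Polish expression from left to right and applying each operator to the one or two operands just before it, maps naturally onto a stack of document sets. The sketch below assumes the operator symbols from the project notes (+ for OR, * for AND, - for NOT); the function name `eval_rpn` and the sample posting sets are hypothetical.

```python
def eval_rpn(rpn, postings, all_ids):
    """Evaluate a reverse Polish token list against keyword -> doc-ID sets."""
    stack = []
    for tok in rpn:
        if tok == "-":                      # one-operand operator: NOT
            operand = stack.pop()
            stack.append(all_ids - operand)
        elif tok in ("*", "+"):             # two-operand operators: AND, OR
            right = stack.pop()
            left = stack.pop()
            stack.append(left & right if tok == "*" else left | right)
        else:                               # operand: look up its document set
            stack.append(postings.get(tok, set()))
    if len(stack) != 1:
        raise ValueError("malformed reverse Polish expression")
    return stack.pop()


# Hypothetical posting sets for three keywords over documents 1..4.
postings = {"A": {1, 2}, "B": {1, 3}, "C": {2, 4}}
all_ids = {1, 2, 3, 4}

print(sorted(eval_rpn(["A", "B", "-", "*"], postings, all_ids)))       # [2]
print(sorted(eval_rpn(["A", "B", "C", "*", "+"], postings, all_ids)))  # [1, 2]
```

Each intermediate result is pushed back as a set of documents, so later operators treat it exactly like the set for a plain keyword.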