Set Theory
Sets

A set is a collection of objects. These objects are called the elements or members of the set. Objects can be anything: numbers, people, words, Web sites, other sets, etc.

Examples of sets include

  • songs on my iPod
  • students enrolled in CS-103
  • computers on the Internet
  • single-digit positive integers
  • integers that are not divisible by 6

We often use variable names to refer to sets and objects. Usually, upper-case letters refer to sets and lower-case letters refer to objects. If x is an element of set A, then we write xA. If x is not a member of A, then we write xA.

A set is completely defined by its elements. This implies two key properties of sets:

  • Order of elements in a set is irrelevant. There is no distinction of "first element" or "second element" etc.
  • Repetition of elements in a set is irrelevant. For example, no matter how many times 3 is named as a single-digit positive integer, the set of single-digit positive integers does not change. 

Two sets A and B are defined to be equal when they have exactly the same elements: every element of A is an element of B and every element of B is an element of A. In this case, we write A = B.

We also allow for an empty set, which is the set with no elements.

 
Explicit Notation for Sets

The simplest way to describe a set is to list its elements inside a pair of curly braces. This is known as explicit set notation.

For example:

  • {1,2} denotes the set whose only elements are 1 and 2.
  • {2,1} = {1,2}
  • {2,2,1,2,1} = {1,2}
  • {1,2,3,4,5,6,7,8,9} denotes the set of single-digit positive integers.

The simplest possible example of explicit set notation is { }, which denotes the empty set.

 
Cardinality

The cardinality of a set A is the number of elements in A, which is written as |A|. Note that this vertical-bar notation looks the same as absolute value notation, but the meaning of cardinality is different from absolute value. In particular, absolute value operates on numbers (e.g., |-3| = 3) while cardinality operates on sets (e.g., |{-3}| = 1).

Examples of cardinality:

  • |{3,4}| = 2
  • |{5,6,5,6,5,6,5,1,1,1}| = |{1,5,6}| = 3
  • |{ }| = 0. The empty set has no elements.
  • |{{1,2},{3,4}}| = 2. In this case the two elements of {{1,2},{3,4}} are themselves sets: {1,2} and {3,4}.
We can also consider the cardinality of infinite sets, a topic made cool by Georg Cantor. His work is beyond the scope of this introduction.
 
Subsets

Given two sets A and B, we say that A is a subset of B if every element of A is also an element of B. We write AB to denote that A is a subset of B.

For example:

  • {1} ⊆ {1,2,3}
  • {2,2,1,3,2,1,3,2} ⊆ {1,2,3}
  • { } ⊆ {1,2,3}
  • Let B be the set of registered students at Boston University, and C be the set of registered students in CS-103. Then CB.
Think about why it is true that { } ⊆ {1,2,3}. There is a somewhat subtle mathematical principle at work in this statement. That same principle is invoked by this statement: "I am friends with every 100-feet-tall person who has ever lived."
 
Venn Diagrams

A Venn diagram is an illustration that represents one or more sets and the relationships between them. For example, suppose A = {1,2,3,4,5} and B = {3,4,5,6,7,8}. Below is a Venn diagram of A and B:

Venn diagram

Usually, Venn diagrams are drawn more simply, without listing the elements of the sets:

Venn diagram

With different examples of sets, the Venn diagram will look different. The circles might not overlap at all, or one circle might be completely contained inside another.

 
Union and Intersection

The union of two sets A and B is the set consisting of any object that is (1) an element of A, or (2) an element of B, or (3) an element of both A and B. The union of A and B is written as AB.

Examples:

  • {1} ∪ {2,3} = {1,2,3}
  • {2} ∪ {2,3} = {2,3}
  • { } ∪ {4,6} = {4,6}
  • A = {1,2,3,4,5}. B = {3,4,5,6,7,8}.  AB = {1,2,3,4,5,6,7,8}.

The intersection of A and B is the set of all objects that are elements of both A and B. The intersection of A and B is written as AB.

Examples:

  • {2} ∩ {2,3} = {2}
  • {3,2} ∩ {2,3} = {2,3}
  • {2,3} ∩ {4,6} = { }
  • A = {1,2,3,4,5}. B = {3,4,5,6,7,8}.  AB = {3,4,5}.

The above examples involving sets A and B can be drawn as the following Venn diagram:

Venn diagram of union and intersection
 
Ordered Lists

If we consider "the songs on my iPod" as a familiar example of a set, then "a playlist on my iPod" is the corresponding example of an ordered list. Key properties of ordered lists include:

  • Order of elements in an ordered list is relevant.
  • Repetition of elements in an ordered list is relevant.

Explicit notion for writing ordered lists is similar to that for sets; however, we use parentheses instead of curly braces. For example:

  • (1,2) is an ordered list that is different than (2,1)
  • (1,1) is an ordered list that is different than (1)

From now on, expressions with parentheses "( , , , )" and expressions with curly braces "{ , , , }" have very distinct meanings!

The length of an ordered list is, intuitively speaking, just like the number of songs in a playlist. For example:

  • The length of (1,10) is 2
  • The length of (1,1,2,2) is 4

An ordered pair is an ordered list of length two. We will use ordered pairs very often in our study of the Web; we will use longer ordered lists less often. Also, we will never attempt to define the cardinality of an ordered list. Sets have cardinality. Ordered lists have length.

Any element in an ordered list can itself be an ordered list or a set. Any object in a set can be a set or an ordered list. For example:

  • {(1,2),(3,4)} is a set of two elements, each of which is an ordered pair. One of the elements of the set is (1,2) and the other element of the set is (3,4).
  • ({1,2},{3},{2,4}) is an ordered list of sets. The first element of the list is the set {1,2}; the second element of the list is the set {3}; the third element of the list is the set {2,4}.
 
Implicit Notation for Sets

Explicit set notation is convenient for small sets but impractical for large sets, such as the set of all Web pages. Implicit set notation (also called set builder notation) is a more sophisticated way to describe sets with mathematical rigor.

There are two parts to any implicit set expression:

  1. A dummy variable. Very often we will use x as our dummy variable.
  2. A logical true/false statement. Usually, this statement is an expression that uses the dummy variable.

These two parts are enclosed within curly braces and separated by a colon (":"). For example:

  • A = {x : x is a single-digit positive integer}
  • B = {x : x is an even element of set A}

The above two sets can also be written explicitly:

  • A = {1,2,3,4,5,6,7,8,9}
  • B = {2,4,6,8}

How to interpret implicit set notation

Suppose P(x) is any logical true/false statement that uses x in some way. (For example, P(x) = "x is a single-digit positive integer"; or P(x) = "x is a Web page".) With that, we can write implicit set notation that looks like this: {x : P(x)}. The set defined by {x : P(x)} is the set of all objects for which logical statement P is true. For example:

  1. Suppose P(x) = "x is a Web page"; then
  2. {x: P(x)} = {x: x is a Web page}
  3. http://brucehoppe.com/img/index.html ∈ {x: x is a Web page}
  4. Bruce Hoppe ∉ {x: x is a Web page}
  5. |{x: x is a Web page}| is approximately 50 billion

Statement #3 above is true because "http://brucehoppe.com/img/index.html is a Web page" is true. Similarly, statement #4 above is true because "Bruce Hoppe is a Web page" is false.

 
Logical Expressions

Grossly simplifying an entire branch of mathematics (logic) to suit our Web-focused ends, we note three kinds of logical true/false statements, often referred to as expressions:

Constant expressions such as "True" and "False" are the simplest possible logical true/false statements, each with meaning that is evident. The truth or falsehood of a constant expression is always the same; it does not depend on the value of any dummy variable.

Simple expressions are like simple clauses in English, which have one verb. Usually they involve a dummy variable. For example:

  • x = 3 is a simple true/false expression. Its truth or falsehood depends on x.
  • Suppose A is the set of all Web pages. Then "x ∈ A" is a simple true/false expression.
  • Suppose N is a number with a specific value (e.g., 3). Then "x ≤ N" is a simple true/false expression.

As illustrated above, possible"verbs" in mathematical true/false expressions include the following: "=", "∈", and "≤".

Compound expressions are like compound sentences in English. They combine expressions (simple or compound) with conjunctions like "and" and "or". For example:

  • x is an integer and 0 ≤ x and x ≤ 3
  • x is a song on my iPod or x is a song I heard on the radio this year
  • x is a Web page with a hyperlink pointing to http://brucehoppe.com/img/index.html and x is not authored by Bruce Hoppe
 
Compound expressions with "or"

Suppose we have two true/false statements: P, Q. We can use the mathematical conjunction "or" to combine these into a single compound expression "P or Q". The truth or falsehood of "P or Q" is defined as follows:

Mathematical definition of "or"
as illustrated by "P or Q"

Value of Q
Q is True
Q is False
Value of P
P is True
"P or Q" is True
"P or Q" is True
P is False
"P or Q" is True
"P or Q" is False
 
Compound expressions with "and"

Suppose we have two true/false statements: P, Q. We can use the mathematical conjunction "and" to combine these into a single compound expression "P and Q". The truth or falsehood of "P and Q" is defined as follows:

Mathematical definition of "and"
as illustrated by "P and Q"

Value of Q
Q is True
Q is False
Value of P
P is True
"P and Q" is True
"P and Q" is False
P is False
"P and Q" is False
"P and Q" is False
 
Union and intersection defined formally

Using implicit set notation and compound logical expressions, we can restate the definitions of union and intersection more formally than we did when we introduced the terms.

For any two sets A and B:

  • A ∪ B = {x: x ∈ A or x ∈ B}
  • A ∩ B = {x: x ∈ A and x ∈ B}

The above definitions mathematically restate the familiar Venn diagram:

Venn diagram of union and intersection
 
Similarity of Sets

We measure the similarity of two non-empty sets A and B with the Jaccard Index:

J(A,B) = |A ∩ B| / |A ∪ B|.

If A and B are disjoint then J(A,B)=0. If A=B then J(A,B)=1. (Assuming A and B are non-empty.)

Suppose A = {1,2,3,4,5,6}. Below we use J(A,B) to measure the similarity of set A with 5 different example sets B:

Set B Venn diagram of sets A and B Jaccard index J(A,B)
B = {7,8,9,10,11,12}
jaccard-1

J(A,B) =

|{1,2,3,4,5,6} ∩ {7,8,9,10,11,12}|

|{1,2,3,4,5,6} ∪ {7,8,9,10,11,12}|

= 0/12;

Not at all similar

B = {4,5,6,7,8,9}
j-2

J(A,B) =

|{1,2,3,4,5,6} ∩ {4,5,6,7,8,9}|


|{1,2,3,4,5,6} ∪ {4,5,6,7,8,9}|

= 3/9;

Somewhat similar

B = {2,3,4,5,6,7}
j-3

J(A,B) = 5/7

Quite similar

B = {4}
j-4

J(A,B) = 1/6

B = {4,5,6,7, ... 25}
j-5
J(A,B) = 3/25
 


Joomla Templates by Joomlashack