Our course introduces Web science. It has no prerequisites and has been used by nontechnical undergraduates at Boston University since 2006. Our curriculum is guided by the following passage adapted from "Web Science: An Interdisciplinary Approach to understanding the Web," by James Hendler, Nigel Shadbolt, Wendy Hall, and Tim BernersLee: Web science, an emerging interdisciplinary field, takes the Web as its primary object of study. This study incorporates both the social interactions enabled by the Web's design and the applications that support them. The Web is often studied at the micro scale, as an infrastructure of protocols, programming languages, and applications. However, it is the interaction of human beings creating, linking, and consuming information that generates the Web's behavior as emergent properties at the macro scale. These properties often generate surprising properties that require new analytic methods to be understood. For example, when Mosaic, the first popular Web browser, was released publicly in 1992, the number of users quickly grew by several orders of magnitude, with more than a million downloads in the first year. The wide deployment of Mosaic led to a need for a way to find relevant material on the growing Web, and thus search became an important application, and later an industry, in its own right. The enormous success of search engines has inevitably yielded techniques to game the algorithms (an unexpected result) to improve search rank, leading, in turn, to the development of better search technologies to defeat the gaming. More recent macroscale examples include photosharing on Flickr, videouploading on YouTube, and socialnetworking sites like mySpace and Facebook. The essence of Web science is to understand how to design systems to produce the effects we want. The best we can do today is design and build in the micro, hoping for the best; but how do we know if we've built in the right functionality to ensure the desired macroscale effects? How do we predict other side effects and the emergent properties of the macro? Further, as the success or failure of a particular Web technology may involve aspects of social interaction among users, understanding the Web requires more than a simple analysis of technological issues but also of the social dynamic of perhaps millions of users. Given the breadth of the Web and its inherently multiuser (social) nature, its science is necessarily interdisciplinary, involving at least mathematics, computer science, sociology, psychology, and economics.

Micro vs. Macro: themes of Web science 
Four important themes of Web Science are  Micro: an individual acts
 Macro: the world responds (or not) to an individual's action
 Synthetic: something is created to produce a desired result
 Analytic: laws are stated to explain observed phenomena
We focus on these themes as they apply to Web builders  people who contribute links and other content to the Web:  Synthetic
 Analytic
 Micro
 An individual builds a Web site to produce a desired result.
 (We do not speak to this quadrant.)
 Macro
 "The world" builds a Web site to produce a desired result.  Laws are stated to explain largescale Web phenomena.

Some Web builders consider themselves Web developers; others consider themselves bloggers; others merely post an occasional comment on someone else's blog or discussion forum. We say "Web builder" to encompass the full spectrum of people who contribute links and other content to the Web. Our lab curriculum provides an informal handson approach to the task of building a Web site. Our Search and Share pages help Web builders leverage collectively engineered resources (such as WordPress). The formal chapters of the Study page (which you are now reading) explain large scale Web phenomena; they also explain the Amazon recommendation algorithm and the Google PageRank algorithm. The sociology, psychology, and economics of this course follow Duncan Watts' Six Degrees, which we recommend as a narrative companion to our own material. Our complete suggested reading list is below. 
Protecting yourself from evildoers Privacy, trust, and ownership 
Basic mathematical foundations of networks: Set Theory  Sets
 Explicit Notation for Sets
 Cardinality
 Subsets
 Venn Diagrams
 Union and Intersection
 Ordered Lists
 Implicit Notation for Sets
 Logical Expressions
 Compound expressions with "or"
 Compound expressions with "and"
 Union and intersection defined formally
 Similarity of Sets
Graph Theory  Graphs
 Undirected and Directed
 Neighborhood and Degree
 Density and Average Degree
 Paths
 Paths in undirected graphs defined formally
 Paths in directed graphs
 Length
 Distance
See also Facebook and Touchgraph 
Hubs, clusters, and other basic structural features of the Web: Network Structure  Connected: a word of many meanings
 Induced Subgraphs
 "Connected" defined formally
 Connected graphs and connected components
 Hubs
 Clusters
 Defining clusters, part one: connected components
 Defining clusters, part two: cliques
 Defining clusters, part three
See also: 
How randomness, homophily, and cumulative advantage shape the Web: Network Dynamics  Limitations of traditional graph theory
 Introduction to network dynamics
 Three models of dynamic graphs
 Random graphs
 Demonstration of random graph dynamics
 Random graph algorithm
 Clusters and homophily
 Triadic closure
 Triadic closure algorithm
 Hubs and cumulative advantage
 Preferential attachment algorithm
See also: All the above are summarized in the following table:  Random graphs
 Clustering
 Centrality
 Realworld phenomenon explained by model
 Giant component forms quickly when E ≅ V.
 Clusters emerge, providing "table of contents" overview.
 Hubs emerge, indicating popularity and/or influence.
 Web sites
 N/A
 Clusty, iBoogie, Grokker
 Google et al
 Sociological force
 Chance
 Homophily
 Cumulative advantage
 Mathematical model
 Random graph algorithm
 Triadic closure algorithm
 Preferential attachment algorithm


Variables, Probability, and ScaleFree Networks 
Understanding that the Web is a scalefree network requires some probability theory: Variables and Probability  Variables in mathematics
 Variables in algorithms
 Random variables
 Discrete vs. continuous variables
 Probability distributions
 Degree distributions
General discussion of scalefree networks:  Six Degrees Chapter 4, pp 101114
 From previous chapter on Network Dynamics
 Hubs and cumulative advantage
 Preferential attachment algorithm

Information and Computation 
Applying fundamental concepts of computer science to the Web Information and computation  Information, computation, and algorithms
 Summation: an example of what computation is
 HTML: an example of what computation is not
 Computing distance, part one: Information diffusion
 Computing distance, part two: Example
 Computing distance, part three: Algorithm
Examples of information diffusion on the Web: See also: 
How to compute personalized recommendations: Collaborative Filtering  "Expert opinions" without the experts
 Delicious: example of CF
 Bookmarks: content of Delicious
 Tuples: content of CF
 Bipartite graphs: structure of CF
 Structural equivalence: computation of CF
 Delicious: algorithmic summary
 The four steps of collaborative filtering

Niches and blockbusters in the world of Web commerce: The Long Tail  Macroanalytic view of collaborative filtering
 Power law revisited
 Niches, megahits, and the neglected middle
 Macroanalytic view of the long tail
 Macro view of Web programming
See also:  The Long Tail, by Chris Anderson. Wired, October 2004.
 Going Long, by John Cassidy. The New Yorker, July 2006.
 Six Degrees Chapter 7, pp 207215: Information Externalities & Market Externalities

How to compute the influence of a Web page: Influence in Networks  Popularity, influence, and centrality
 Introduction to PageRank
 NetRank: a simplified version of PageRank
 Normalization and convergence
 The NetRank algorithm
 Dividing by outdegree: the NR* formula
 The PageRank formula
 The damping factor: PageRank as probability
See also PageRank Explained by Phil Craven 
Competition and Cooperation 
What happens when Web builders seek to increase their influence? Games: Competition and Cooperation  Dynamics of popularity and influence
 PageRank competition
 Doing the right thing
 Mutually assured construction
 Authority, reciprocity, reputation
 Game theory
 Winners' dilemma
See also 
