The World Wide Web CSCE 101 – Spring 2010
Page 1: The World Wide Web CSCE 101 – Spring 2010

The World Wide Web

CSCE 101 – Spring 2010

Page 2: The World Wide Web CSCE 101 – Spring 2010

The Internet and the Web

Internet: A worldwide computer network that connects hundreds of thousands of smaller networks. “The mother of all networks”.

World Wide Web: The interconnected system of servers that support multimedia documents, i.e. the multimedia part of the Internet.

Timeline:
  Early 1960s: introduction of the network concept
  1970: ARPANET, scholarly-aimed networks
    62 computers in 1974
    500 computers in 1983
    28,000 computers in 1987
  1975: Ethernet developed by Robert Metcalfe
  1980: TCP/IP
  1982: The first computer virus, Elk Cloner, spread via Apple II floppy disks
  1989: Web invented by Tim Berners-Lee
  1990: First Web browser based on HTML developed by Berners-Lee
  Early 1990s: Andreessen developed the first widely used graphical browser (Mosaic)
  1993: The White House launches its Web site

Page 3: The World Wide Web CSCE 101 – Spring 2010

Web Browsers

Web Browser: A software application for retrieving, presenting, and traversing information resources on the World Wide Web.
  Software that enables users to view Web pages and to jump from one page to another, e.g. IE, Mozilla Firefox, Safari, etc.

Which browser is better? Why?

Web Page: A document on the Web that can include multimedia data

Web Site: A collection of related Web pages usually designed or controlled by the same individual or company. Generally shares a common domain name.

Practical Browser Tools:
  Status Bar: security info, page load progress
  Favorites (bookmarks)
  View Source: view the code of a Web page
  Tools > Internet Options: history, temporary Internet files, home page, auto complete, security settings, programs, etc.

Page 4: The World Wide Web CSCE 101 – Spring 2010

Domain Names

URL (Uniform Resource Locator): The human-friendly address of a Web page
  String of characters that points to a piece of information on the Internet
  Syntax: protocol://domain name/directory/file, e.g. http://www.sc.edu/~hanczary/lectures.html
  The domain name includes the domain type and sometimes a country extension
  Have you ever mistyped a URL and gone to a website you weren’t expecting?
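The URL syntax above can be taken apart with Python's standard library. This is a minimal sketch: the function name describe_url and the dictionary keys are illustrative choices that simply mirror the labels in the syntax line.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into the parts named in the slide's syntax line."""
    parts = urlsplit(url)
    # Everything before the last "/" of the path is treated as the
    # directory; whatever follows it is treated as the file.
    directory, _, filename = parts.path.rpartition("/")
    return {
        "protocol": parts.scheme,
        "domain name": parts.hostname,
        "directory": directory or "/",
        "file": filename,
    }


if __name__ == "__main__":
    for label, value in describe_url("http://www.sc.edu/~hanczary/lectures.html").items():
        print(f"{label}: {value}")
```

Mistyping a single character of the domain name changes the "domain name" part, which is why a typo can land on a completely different site.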

ICANN, a non-profit organization, was established to regulate human-friendly domain names

DNS (Domain Name System): A distributed set of servers storing domain information in hierarchical fashion
  DNS provides the mapping between the domain names and IP addresses of Internet sites
  DNS requires static IP addresses
  DNS poisoning
  Domain names must be registered to ensure uniqueness; registration fees vary
  Cybersquatting
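The hierarchical lookup that DNS performs can be imitated with nested dictionaries. This is only a toy model, not how a real resolver talks to name servers: the ROOT table, the example.com and example.edu names, and the 192.0.2.x addresses (a range reserved for documentation) are all made up for illustration.

```python
# Toy data: every name and address below is hypothetical.
ROOT = {
    "com": {"example": {"www": "192.0.2.10", "mail": "192.0.2.25"}},
    "edu": {"example": {"www": "192.0.2.80"}},
}


def resolve(domain_name: str) -> str:
    """Walk the hierarchy from the top-level domain down to the host."""
    node = ROOT
    # "www.example.com" is looked up right to left: com -> example -> www.
    for label in reversed(domain_name.lower().split(".")):
        if not isinstance(node, dict) or label not in node:
            raise KeyError(f"no record for {domain_name}")
        node = node[label]
    if isinstance(node, dict):
        raise KeyError(f"{domain_name} names a zone, not a host")
    return node


if __name__ == "__main__":
    print(resolve("www.example.com"))
```

In this model, DNS poisoning would correspond to someone replacing an address in the table so that a familiar name leads to the wrong machine.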

Page 5: The World Wide Web CSCE 101 – Spring 2010

Domain Names

Main Domain Extension Types: Suffix Extension Descriptions

.com (commercial) is a generic top-level domain. It was one of the original top-level domains, and has grown to be the largest in use.

.org (organization) is a generic top-level domain, and is mostly associated with non-profit organizations. It is also used in the charitable field and by the open-source movement. Some government-related sites and political parties in the US have domain names ending in .org.

.net (network) is a generic top-level domain and is one of the original top-level domains. It was initially intended to be used only for network providers (such as Internet service providers). It is still popular with network operators and is often treated as a second .com. It is currently the third most popular top-level domain.

.edu (education) is the generic top-level domain for educational institutions, primarily those in the United States. One of the first top-level domains, .edu was originally intended for educational institutions anywhere in the world. Only post-secondary institutions that are accredited by an agency on the U.S. Department of Education's list of nationally recognized accrediting agencies are eligible to apply for a .edu domain.

.info (information) is a generic top-level domain intended for informative websites, although its use is not restricted. It is an unrestricted domain, meaning that anyone can obtain a second-level domain under .info. The .info domain was one of several extensions meant to take the pressure off the overcrowded .com domain.

.gov (government) is a generic top-level domain used by government entities in the United States. Other countries typically use a second-level domain for this purpose, e.g., .gov.uk for the United Kingdom. Since the United States controls the .gov top-level domain, it would be impossible for another country to create a domain ending in .gov.

.biz (business): the name is a phonetic spelling of the first syllable of "business." A generic top-level domain to be used by businesses. It was created due to the demand for good domain names in the .com top-level domain, and to provide an alternative to businesses whose preferred .com domain name had already been registered by another party.
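As a small illustration of the suffixes listed above, the sketch below pulls the last label off a domain name and looks it up. The table TLD_MEANINGS and both function names are hypothetical helpers that cover only the seven extensions on this slide.

```python
# Only the extensions described on this slide; anything else is reported
# as not covered rather than guessed at.
TLD_MEANINGS = {
    "com": "commercial",
    "org": "organization",
    "net": "network",
    "edu": "education",
    "info": "information",
    "gov": "government",
    "biz": "business",
}


def top_level_domain(domain_name: str) -> str:
    """Return the last dot-separated label of a domain name."""
    return domain_name.rstrip(".").rsplit(".", 1)[-1].lower()


def describe_domain(domain_name: str) -> str:
    tld = top_level_domain(domain_name)
    return f".{tld} ({TLD_MEANINGS.get(tld, 'not covered on this slide')})"


if __name__ == "__main__":
    print(describe_domain("www.sc.edu"))
```

Note that for a name such as www.direct.gov.uk the top-level domain is the country code uk, and gov is only a second-level domain, as the .gov entry explains.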

Page 6: The World Wide Web CSCE 101 – Spring 2010

Cookies

Little text files left on your hard disk by some websites you visit
  Cookies are data, not programs; they do not generate pop-ups or behave like viruses
  Can include your log-in name and browser preferences
    Can be convenient
    But they can be used to gather information about you and your browsing habits
  “Third party” cookies: used by advertising companies to track users across multiple sites
  People share machines

Sample Amazon.com cookie:

  Name             Value                             Domain
  session-id-time  954242000                         amazon.com/
  session-id       002-4135256-7625846               amazon.com/
  x-main           eKQIfwnxuF7qtmX52x6VWAXh@Ih6Uo5H  amazon.com/
  ubid-main        077-9263437-9645324               amazon.com/
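To show that a cookie is plain data, the sketch below reads two of the sample name/value pairs with Python's standard http.cookies module. The header string is an assumption about how a browser would send these values back to the site; the function name parse_cookie_header is illustrative.

```python
from http.cookies import SimpleCookie


def parse_cookie_header(header: str) -> dict:
    """Turn a 'name=value; name=value' cookie header into a dictionary."""
    cookie = SimpleCookie()
    cookie.load(header)
    return {name: morsel.value for name, morsel in cookie.items()}


if __name__ == "__main__":
    # Values copied from the sample cookie above; the header layout is assumed.
    header = "session-id=002-4135256-7625846; ubid-main=077-9263437-9645324"
    for name, value in parse_cookie_header(header).items():
        print(name, "=", value)
```

Nothing here is executed by the browser: the site only gets back the same text it stored earlier, which is what lets it recognize a returning visitor.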

Page 7: The World Wide Web CSCE 101 – Spring 2010

E-mail

E-mail Software and Carriers:
  Free Web-based e-mail services (e.g. Yahoo Mail) or bundled with software (e.g. MS Outlook)

E-mail Privacy:
  How did they find my e-mail address?
  Can anyone read the content of my messages?
  What happens to my deleted e-mail messages?
  What are my rights? Basically none.
  Can anything be done to enhance e-mail privacy?

E-mail Security: Dangers of attachments and HTML graphics

Useful E-mail Tools: Mailing lists, filters (rules)

Page 8: The World Wide Web CSCE 101 – Spring 2010

Deciphering Spam

Spam: Unsolicited e-mail in the form of advertisements or chain letters.
  Waste of storage space, processing power, bandwidth, and time
  E-mail address spoofing, disposable e-mail addresses or anonymous re-mailers, and zombies are techniques used in spamming
  E-mail address harvesting

Motives:
  Marketing
  Chain letters & hoaxes
  Malicious intent
  Theft of confidential information (e.g. phishing)

Spam Filters:
  Pattern-based or content-based
  Challenge-based
  Blacklist and whitelist based
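Two of the filter types above, list-based and pattern-based, can be combined in a few lines. This is a rough sketch and not how any particular mail product works: the addresses, the patterns, and the function name classify are invented for the example, and challenge-based filtering is not modeled.

```python
import re

# Hypothetical lists; real filters maintain these per user or per server.
WHITELIST = {"professor@example.edu"}
BLACKLIST = {"deals@spam.example"}

# Hypothetical content patterns typical of ads, chain letters, and phishing.
SPAM_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (
        r"free money",
        r"forward this to \d+ friends",
        r"verify your account",
    )
]


def classify(sender: str, body: str) -> str:
    """Return 'inbox' or 'spam' for one message."""
    sender = sender.lower()
    if sender in WHITELIST:      # trusted senders always get through
        return "inbox"
    if sender in BLACKLIST:      # known spammers never do
        return "spam"
    if any(p.search(body) for p in SPAM_PATTERNS):
        return "spam"            # content looks like spam
    return "inbox"


if __name__ == "__main__":
    print(classify("stranger@example.com", "Forward this to 10 friends!"))
```

Checking the whitelist first means a trusted sender is never blocked by a pattern match, at the cost of letting through anything sent from a spoofed trusted address.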

Fight back by reporting new spammers to www.abuse.net, www.spamhaus.org, or www.rahul.net/falk

Page 9: The World Wide Web CSCE 101 – Spring 2010

Searching for Information

Search engine databases are often compiled using software programs called spiders
  Spiders crawl through the Web, following links from one page to another
  Index the words on that site
  Indexing techniques
  Influencing search results (paid, malicious e.g. Google bombs), link rot

If you publish an embarrassing web page and then take it down, is it REALLY gone?
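The crawl-and-index idea can be sketched on a tiny in-memory "web". Everything in PAGES is hypothetical (page names, text, and links), and a real spider would fetch pages over the network instead of reading a dictionary.

```python
import re
from collections import deque

# Hypothetical web: page name -> (page text, links found on the page).
PAGES = {
    "home.html": ("Welcome to the course web site", ["lectures.html", "labs.html"]),
    "lectures.html": ("Lectures cover the web and e-mail", ["home.html"]),
    "labs.html": ("Labs cover web browsers", ["missing.html"]),
}


def crawl_and_index(start: str) -> dict:
    """Follow links from the start page and map each word to its pages."""
    index = {}
    seen = {start}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if url not in PAGES:
            continue  # dead link, i.e. link rot
        text, links = PAGES[url]
        for word in re.findall(r"[a-z0-9-]+", text.lower()):
            index.setdefault(word, set()).add(url)
        for link in links:
            if link not in seen:  # visit each page only once
                seen.add(link)
                queue.append(link)
    return index


if __name__ == "__main__":
    index = crawl_and_index("home.html")
    print(sorted(index["web"]))
```

In this sketch the index is a separate copy of the words: deleting a page from PAGES after the crawl does not remove it from the index until the spider runs again, which is one reason a removed page may not be gone right away.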

Guidelines to evaluate Web resources
  Should you trust information you find online?
  Does the information appear on a professional site maintained by a professional organization?
  Does the website authority appear to be legitimate?
  Is the website objective, complete, and current?

Page 10: The World Wide Web CSCE 101 – Spring 2010

Search Engines

Types of Search Engines:
  Human-organized: Documents are categorized by subject-area experts, smaller databases, more accurate search results, e.g. Open Directory, About
  Computer-created: Software spiders crawl the web for documents and categorize pages, larger databases, ranking systems, e.g. Google
  Hybrid: Combines the two categories above
  Metasearch or clustering: Direct queries to multiple search engines and cluster results, e.g. Copernic, Vivisimo, Mamma
  Topic-specific: e.g. WebMD

Advanced Search Options:
  Searches for various information formats & types, e.g. image search, scholarly search
  Advanced query operators and wild cards
    ? (e.g. science? means search for the keyword “science” but I am not sure of the spelling)
    * (wildcard, e.g. comput* searches for keywords starting with “comput” combined with any word ending)
    x AND y (both terms must be present)
    x OR y (at least one of the terms must be present)
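The * wildcard and the AND / OR operators can be tried out on a few made-up documents. This sketch assumes a tiny collection named DOCS and uses shell-style matching from Python's fnmatch module; it does not model the ? operator, whose meaning on the slide (uncertain spelling) differs from the single-character wildcard that fnmatch implements.

```python
from fnmatch import fnmatchcase

# Hypothetical document collection: name -> text.
DOCS = {
    "doc1": "computer science lecture",
    "doc2": "computing history",
    "doc3": "political science",
}


def search(pattern: str) -> set:
    """Names of documents containing a word that matches the pattern."""
    return {
        name
        for name, text in DOCS.items()
        if any(fnmatchcase(word, pattern) for word in text.split())
    }


def search_and(x: str, y: str) -> set:
    return search(x) & search(y)  # both terms must be present


def search_or(x: str, y: str) -> set:
    return search(x) | search(y)  # at least one term must be present


if __name__ == "__main__":
    print(sorted(search("comput*")))
    print(sorted(search_and("comput*", "science")))
    print(sorted(search_or("history", "political")))
```

AND narrows the result set (intersection) while OR widens it (union), which is why adding AND terms is the usual way to cut down an overly long result list.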

Page 11: The World Wide Web CSCE 101 – Spring 2010

More Web Resources

Wikis:
  A Wiki is a website on which authoring and editing can be done by anyone at any time using a simple browser.
  Wikipedia, Wikimedia, Wikibooks, Citizendium, etc.
  Allow individuals to edit content to facilitate collaboration
  Accuracy concerns

Internet Telephony (VoIP):
  Providers include Vonage, Verizon, Skype, etc.
  Uses the Internet to make phone calls, videoconference
  Long-distance calls are either very inexpensive or free
  Quality, security, and reliability concerns

Page 12: The World Wide Web CSCE 101 – Spring 2010

More Web Resources

Social Networks:
  MySpace, Facebook, Friendster, Orkut, etc.
  What are some features of today’s popular social networks?
  Anti-social networks?
  Social networks as “study groups”, Courses 2.0
  Privacy and safety concerns

Plagiarism in the Internet Age:
  In a recent survey, 60% of students revealed that they have cheated in the past
  Websites offering course material, e.g. coursehero.com, cheathouse.com
  Use of portable electronic devices for cheating
  Services used to combat cheating, e.g. turnitin.com

Page 13: The World Wide Web CSCE 101 – Spring 2010

More Web Resources

Instant messaging (IM) and real-time chat (RTC) software
  Multi-protocol IM clients (AIM)
  Web-based IM systems (Forum, chat room)

Podcasting

Blogs
  Blogger, Xanga, LiveJournal, etc.
  Microblog, vlog, photoblog, sketchblog, linklog, etc.
  Blog search engines
  Blogs and advertising, implications of ad blocking software
  Do bloggers have the same rights as journalists?

Really Simple Syndication (RSS)
  FireAnt, i-Fetch, RSS Captor, etc.
  Built-in Web browser RSS features
  Search download.com for keyword: “RSS Readers”
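An RSS feed is an XML document that a reader program checks for new items. The sketch below shows roughly what an RSS reader does with one, using Python's standard XML parser; the feed text in SAMPLE_FEED is a made-up example, and the function name read_feed is illustrative.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed; a real reader would download this text.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Course Blog</title>
    <link>http://www.example.com/blog</link>
    <description>A made-up feed for illustration</description>
    <item>
      <title>Lecture notes posted</title>
      <link>http://www.example.com/blog/notes</link>
    </item>
    <item>
      <title>Lab schedule updated</title>
      <link>http://www.example.com/blog/labs</link>
    </item>
  </channel>
</rss>"""


def read_feed(xml_text: str):
    """Return the feed title and a list of (item title, item link) pairs."""
    channel = ET.fromstring(xml_text).find("channel")
    items = [
        (item.findtext("title"), item.findtext("link"))
        for item in channel.findall("item")
    ]
    return channel.findtext("title"), items


if __name__ == "__main__":
    title, items = read_feed(SAMPLE_FEED)
    print(title)
    for item_title, link in items:
        print(" ", item_title, "->", link)
```

Because every feed follows the same channel/item layout, one reader can follow many blogs and podcasts without visiting each site.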