Surface web
The Surface Web (also
called the Visible Web, Indexed Web, Indexable
Web or Lightnet)[1] is
the portion of the World
Wide Web that is readily available to the general public and
searchable with standard web
search engines. It is the opposite of the deep
web.
According to one source, as of June 14, 2015, Google's index of the surface web contained about 14.5 billion pages.
The deep web,[1] invisible
web,[2] or hidden
web[3] are
parts of the World Wide Web whose contents are
not indexed by standard web
search engines for any reason.[4] The
opposite term to the deep web is the surface
web, which is accessible to anyone using the Internet.[5] Computer
scientist Michael K. Bergman is credited with coining the term deep
web in 2001 as a search indexing term.[6]
The content of the deep web is hidden behind HTTP forms[7][8] and includes many very common uses such as web mail and online banking, as well as services that users must pay for and that are protected by a paywall, such as video on demand and some online magazines and newspapers.
Content of the deep web can be located and accessed by a direct URL or IP address, and may require a password or other security access beyond the public website page.
Terminology
The first conflation of
the terms "deep web" and "dark web"
came about in 2009 when the deep web search terminology was discussed alongside
illegal activities taking place on the Freenet darknet.[9]
Since then, following its use in media reporting on the Silk Road, many people and media outlets[10][11] have taken to using deep web synonymously with the dark web or darknet, a comparison many reject as inaccurate[12] and which is consequently an ongoing source of confusion.[13] Wired reporters Kim Zetter[14] and Andy Greenberg[15] recommend that the terms be used in distinct fashions. While the deep web refers to any site that cannot be accessed through a traditional search engine, the dark web is a portion of the deep web that has been intentionally hidden and is inaccessible through standard browsers and methods.[16][17][18][19][20]
Non-indexed content
Bergman, in a paper on the deep
web published in The Journal of Electronic Publishing, mentioned
that Jill Ellsworth used the term Invisible Web in 1994 to
refer to websites that
were not registered with any search engine.[21] Bergman
cited a January 1996 article by Frank Garcia:[22]
It would be a site that's possibly
reasonably designed, but they didn't bother to register it with any of the
search engines. So, no one can find them! You're hidden. I call that the
invisible Web.
Another early use of the
term Invisible Web was by Bruce Mount and Matthew B. Koll of
Personal Library Software, in a description of the #1 Deep Web tool found in a
December 1996 press release.[23]
The first use of the specific
term deep web, now generally accepted, occurred in the
aforementioned 2001 Bergman study.[21]
Indexing methods
Methods which prevent web pages
from being indexed by traditional search engines may be categorized as one or
more of the following:
1. Contextual web: pages with content varying for different access contexts (e.g., ranges of client IP addresses or previous navigation sequence).
2. Dynamic content: dynamic pages which are returned in response to a submitted query or accessed only through a form, especially if open-domain input elements (such as text fields) are used; such fields are hard to navigate without domain knowledge.
3. Limited access content: sites that limit access to their pages in a technical way (e.g., using the Robots Exclusion Standard or CAPTCHAs, or no-store directives, which prohibit search engines from browsing them and creating cached copies);[24] see the sketch after this list.
4. Non-HTML/text content: textual content encoded in multimedia (image or video) files or specific file formats not handled by search engines.
5. Private web: sites that require registration and login (password-protected resources).
6. Scripted content: pages that are only accessible through links produced by JavaScript, as well as content dynamically downloaded from Web servers via Flash or Ajax solutions.
7. Software: certain content is intentionally hidden from the regular Internet, accessible only with special software, such as Tor, I2P, or other darknet software. For example, Tor allows users to access websites using the .onion server address anonymously, hiding their IP address.
8. Unlinked content: pages which are not linked to by other pages, which may prevent web crawling programs from accessing the content. Such pages are referred to as pages without backlinks (also known as inlinks). Also, search engines do not always detect all backlinks from searched web pages.
9. Web archives: Web archival services such as the Wayback Machine enable users to see archived versions of web pages across time, including websites which have become inaccessible and are not indexed by search engines such as Google.[25]
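As an illustration of item 3, here is a minimal sketch, using only the Python standard library, of how a compliant crawler consults the Robots Exclusion Standard before fetching a page; the domain and user-agent string are placeholder assumptions, not references to any real crawler.

    import urllib.robotparser

    # Fetch and parse a site's robots.txt, then test whether a URL may be
    # crawled. "example.org" and "ExampleBot/1.0" are placeholders.
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url("https://example.org/robots.txt")
    rp.read()  # retrieves and parses robots.txt over the network

    url = "https://example.org/private/report.html"
    if rp.can_fetch("ExampleBot/1.0", url):
        print("allowed: a search engine may crawl and index this page")
    else:
        print("disallowed: a compliant engine will leave this page unindexed")

A page excluded this way remains publicly reachable by direct URL; it is merely invisible to engines that honor the standard.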
Content types
While it is not always possible
to directly discover a specific web server's content so that it may be indexed,
a site potentially can be accessed indirectly (due to computer vulnerabilities).
To discover content on the web,
search engines use web
crawlers that follow hyperlinks through known protocol
virtual port numbers. This technique is ideal for discovering
content on the surface web but is often ineffective at finding deep web
content. For example, these crawlers do not attempt to find dynamic pages that
are the result of database queries due to the indeterminate number of queries
that are possible.[6] It has been noted that this can be (partially) overcome by providing links to query results, but this could unintentionally inflate the popularity of a member of the deep web.
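A minimal breadth-first crawler sketch in Python illustrates the hyperlink-following technique described above, and why it misses deep web content: only pages reachable through static links are ever enqueued, so query-driven dynamic pages are never discovered. The start URL is a placeholder.

    from collections import deque
    from html.parser import HTMLParser
    from urllib.parse import urljoin
    from urllib.request import urlopen

    class LinkExtractor(HTMLParser):
        """Collects href targets from anchor tags."""
        def __init__(self):
            super().__init__()
            self.links = []
        def handle_starttag(self, tag, attrs):
            if tag == "a":
                for name, value in attrs:
                    if name == "href" and value:
                        self.links.append(value)

    def crawl(start_url, max_pages=10):
        """Breadth-first crawl that follows static hyperlinks only."""
        seen, queue = {start_url}, deque([start_url])
        while queue and len(seen) < max_pages:
            url = queue.popleft()
            try:
                html = urlopen(url, timeout=5).read().decode("utf-8", "replace")
            except OSError:
                continue  # unreachable or non-HTTP resources are skipped
            parser = LinkExtractor()
            parser.feed(html)
            for href in parser.links:
                absolute = urljoin(url, href)
                # Only statically linked pages are enqueued; pages behind
                # forms or database queries are never discovered this way.
                if absolute.startswith("http") and absolute not in seen:
                    seen.add(absolute)
                    queue.append(absolute)
        return seen

    # Example (placeholder URL): crawl("https://example.org/", max_pages=5)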
DeepPeep, Intute, Deep Web Technologies, Scirus, and Ahmia.fi are
a few search engines that have accessed the deep web. Intute ran out of funding
and is now a temporary static archive as of July 2011.[26] Scirus
retired near the end of January 2013.[27]
Researchers have been exploring
how the deep web can be crawled in an automatic fashion, including content that
can be accessed only by special software such as Tor. In 2001, Sriram Raghavan and Hector Garcia-Molina (Stanford
Computer Science Department, Stanford University)[28][29] presented
an architectural model for a hidden-Web crawler that used key terms provided by
users or collected from the query interfaces to query a Web form and crawl the
Deep Web content. Alexandros Ntoulas, Petros Zerfos, and Junghoo Cho of UCLA created
a hidden-Web crawler that automatically generated meaningful queries to issue
against search forms.[30] Several
form query languages (e.g., DEQUEL[31]) have been proposed that, besides issuing
a query, also allow extraction of structured data from result pages. Another
effort is DeepPeep, a project of the University of Utah sponsored by the National Science Foundation, which
gathered hidden-web sources (web forms) in different domains based on novel
focused crawler techniques.[32][33]
Commercial search engines have
begun exploring alternative methods to crawl the deep web. The Sitemap Protocol (first developed and introduced by Google in 2005)
and OAI-PMH are
mechanisms that allow search engines and other interested parties to discover
deep web resources on particular web servers. Both mechanisms allow web servers
to advertise the URLs that are accessible on them, thereby allowing automatic
discovery of resources that are not directly linked to the surface web.
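A sketch of what the Sitemap Protocol side of this looks like: the web server enumerates its URLs, including ones that no surface page links to, in an XML file that crawlers can poll. The URLs below are placeholders; the namespace is the one the protocol defines.

    import xml.etree.ElementTree as ET

    # Build a minimal sitemap advertising URLs, including unlinked ones.
    SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in ("https://example.org/", "https://example.org/unlinked-report"):
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page  # the advertised location

    # Typically served at https://example.org/sitemap.xml for crawlers to poll.
    print(ET.tostring(urlset, encoding="unicode"))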
Google's deep web surfacing system computes submissions for each HTML form and adds
the resulting HTML pages into the Google search engine index. The surfaced
results account for a thousand queries per second to deep web content.[34] In
this system, the pre-computation of submissions is done using three algorithms:
1. selecting input values for text search inputs that accept keywords,
2. identifying inputs which accept only values of a specific type (e.g., date), and
3. selecting a small number of input combinations that generate URLs suitable for inclusion into the Web search index.
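A toy sketch of the third step, not Google's actual code: given candidate values for each form input (as produced by the first two steps), enumerate a small number of combinations and serialize each as a GET URL fit for an index. The form endpoint and field values are illustrative assumptions.

    from itertools import islice, product
    from urllib.parse import urlencode

    # Hypothetical form inputs with candidate values: keywords guessed for a
    # free-text field (step 1) and values for a typed input (step 2).
    candidates = {
        "q": ["books", "music"],
        "year": ["2007", "2008"],
    }

    def surfaced_urls(action, fields, limit=3):
        """Step 3: keep only a small number of combinations for the index."""
        names = sorted(fields)
        combos = product(*(fields[n] for n in names))
        for combo in islice(combos, limit):
            yield action + "?" + urlencode(dict(zip(names, combo)))

    # "action" is a placeholder form endpoint, not a real service.
    for url in surfaced_urls("https://example.org/search", candidates):
        print(url)  # each URL stands in for one pre-computed form submission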
In 2008, to make it easier for users of Tor hidden services to access and search sites with the hidden .onion suffix, Aaron Swartz designed Tor2web, a proxy application able to provide access by means of common web browsers.[35] Using this application, deep web links appear as a random string of letters followed by the .onion TLD.
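Tor2web-style gateways work by mapping an onion hostname onto an ordinary clearnet hostname that the proxy serves. A minimal sketch of that rewrite, assuming the historical hash.tor2web.org convention (individual gateways differ):

    from urllib.parse import urlsplit, urlunsplit

    def to_tor2web(onion_url, gateway="tor2web.org"):
        """Rewrite http://<hash>.onion/... into a gateway URL that an
        ordinary browser can fetch. The gateway domain and the exact
        rewriting convention vary between proxies; this is one pattern."""
        parts = urlsplit(onion_url)
        host = parts.hostname or ""
        if not host.endswith(".onion"):
            raise ValueError("not an onion address")
        proxied = host[: -len(".onion")] + "." + gateway
        return urlunsplit((parts.scheme, proxied, parts.path, parts.query, ""))

    # The 16-character hash below is a made-up placeholder, not a real service.
    print(to_tor2web("http://abcdefghijklmnop.onion/page"))
    # -> http://abcdefghijklmnop.tor2web.org/page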
Dark web
This article is about darknet websites. For the part of the Internet not accessible by traditional web search engines, see Deep web.
The dark web is the World Wide Web content
that exists on darknets, overlay networks that
use the Internet but require specific software, configurations or
authorization to access.[1][2] The dark
web forms a small part of the deep web, the part of the
Web not indexed by web search engines, although sometimes the term deep web is
mistakenly used to refer specifically to the dark web.[3][4][5][6][7]
The darknets which constitute the dark web include small, friend-to-friend peer-to-peer networks, as well as large, popular networks like Tor, Freenet, I2P and Riffle, operated by public organizations and individuals.
Users of the dark web refer to the regular web as Clearnet due
to its unencrypted nature.[8] The Tor
dark web may be referred to as onionland,[9] a reference to the network's top-level domain suffix .onion and the
traffic anonymization technique of onion routing.
Terminology
The dark web has often been confused with the deep web, which refers to the parts of the web not indexed, and thus not searchable, by standard search engines. This confusion
dates back to at least 2009.[10] Since
then, especially in reporting on Silk Road, the two terms have often been conflated,[11][12][13] despite
recommendations that they should be distinguished.[5][14][15][16]
Definition
Main article: Darknet
Darknet websites are accessible only through networks such as Tor ("The Onion Routing" project) and I2P ("Invisible Internet Project").[17] The Tor browser and Tor-accessible sites are widely used among darknet users and can be identified by the domain ".onion".[18] While Tor focuses on providing anonymous access to the Internet, I2P specializes in allowing anonymous hosting of websites.[19] Identities and locations of darknet users stay anonymous and cannot be tracked due to the layered encryption system. The darknet encryption technology routes users' data through a large number of intermediate servers, which protects the users' identity and guarantees anonymity. The transmitted information can be decrypted only by a subsequent node in the scheme, which leads to the exit node. This complicated system makes it almost impossible to reproduce the node path and decrypt the information layer by layer.[20] Due to the high level of encryption, websites are not able to track the geolocation and IP of their users, and users are not able to get this information about the host. Thus, communication between darknet users is highly encrypted, allowing users to talk, blog, and share files confidentially.[21]
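The layering described above can be illustrated with a toy Python sketch using the third-party cryptography package; real onion routing negotiates per-hop keys with a different construction, so this shows only the peel-one-layer-per-node idea. Node names and keys are placeholders.

    # Toy illustration of layered ("onion") encryption; requires the
    # third-party package:  pip install cryptography
    # Real Tor does NOT use Fernet; this only demonstrates the layering.
    from cryptography.fernet import Fernet

    # One symmetric key per relay on the chosen path (names are placeholders).
    path = [("entry", Fernet.generate_key()),
            ("middle", Fernet.generate_key()),
            ("exit", Fernet.generate_key())]

    # The sender applies one encryption layer per node, innermost first,
    # so the entry node's layer ends up outermost.
    message = b"request for example.org"
    for _, key in reversed(path):
        message = Fernet(key).encrypt(message)

    # Each relay peels exactly one layer; only the exit node sees the
    # plaintext, and no single node learns both sender and destination.
    for name, key in path:
        message = Fernet(key).decrypt(message)
        print(name, "peeled one layer")

    print(message)  # b'request for example.org'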
The darknet is also used for illegal activity such as illicit trade, forums, and media exchange for pedophiles and terrorists.[22] At the same time, traditional websites have created alternative accessibility for the Tor browser in efforts to connect with their users. ProPublica, for example, launched a new version of its website available exclusively to Tor users.[23]
Content
Breakdown of .onion services by content category (King's College London study, February 2016):

Category            | % of total | % of active
Violence            | 0.3        | 0.6
Arms                | 0.8        | 1.5
Illicit Social      | 1.2        | 2.4
Hacking             | 1.8        | 3.5
Illicit links       | 2.3        | 4.3
Illicit pornography | 2.3        | 4.5
Extremism           | 2.7        | 5.1
Illicit Other       | 3.8        | 7.3
Illicit Finance     | 6.3        | 12
Illicit Drugs       | 8.1        | 15.5
Non-illicit+Unknown | 22.6       | 43.2
Illicit total       | 29.7       | 56.8
Inactive            | 47.7       |
Active              | 52.3       |
A December 2014 study by Gareth Owen from the University of Portsmouth found that the most commonly hosted type of content on Tor was child pornography, followed by black markets, while the individual sites with the highest traffic were dedicated to botnet operations.[27] Many whistleblowing sites maintain a presence,[28] as well as political discussion forums.[29] Sites
associated with Bitcoin, fraud related services and mail order services
are some of the most prolific.[27] To
counter the trend of controversial content, the artist collective Cybertwee
held a bake sale on an onion site.[30]
In July 2017, Roger Dingledine, one of
the three founders of the Tor Project, said that Facebook is the biggest hidden service. The Dark
Web comprises only 3% of the traffic in the Tor network.[31]
A more recent February 2016 study from researchers at King's College London gives the breakdown of content by an alternative category set shown in the table above, highlighting the illicit use of .onion services.[32][33]
Botnets
Botnets are often
structured with their command
and control servers based on a
censorship-resistant hidden service, creating a large amount of bot-related
traffic.[27][34]
Bitcoin services
Bitcoin services
such as tumblers are often available on Tor, and some – such as Grams – offer
darknet market integration.[35][36] A research study undertaken by Jean-Loup Richet, a research fellow at ESSEC, and carried out with the United Nations Office on Drugs and Crime, highlighted new trends in the use of Bitcoin tumblers for money laundering purposes. A common approach was to use a digital currency exchanger service which converted Bitcoin into an online game currency (such as gold coins in World of Warcraft) that could later be converted back into money.[37][38]
Darknet markets
Main article: Darknet market
Commercial darknet markets, which mediate transactions for illegal drugs[39] and other
goods, attracted significant media coverage starting with the popularity
of Silk Road and Diabolus Market[40] and its
subsequent seizure by legal authorities.[41] Other markets sell software exploits[42] and weapons.[43] Examinations of price differences in dark web markets versus prices in real life or over the World Wide Web have been attempted, as well as studies of the quality of goods received over the dark web. One such study was performed on Evolution, one of the most popular crypto-markets, active from January 2013 to March 2015.[44] Although
it found the digital information, such as concealment methods and shipping
country, "seems accurate", the study uncovered issues with the
quality of illegal drugs sold in Evolution, stating that, "... the illicit
drugs purity is found to be different from the information indicated on their
respective listings."[44] Less is
known about consumer motivations for accessing these marketplaces and factors
associated with their use.[45]
Hacking groups and services
Many hackers sell their services either individually or as part of groups.[46] Such
groups include xDedic, hackforum, Trojanforge, Mazafaka, dark0de and
the TheRealDeal darknet market.[47] Some have
been known to track and extort apparent
pedophiles.[48] Cyber
crimes and hacking services for financial institutions and banks have also been
offered over the Dark web.[49] Attempts
to monitor this activity have been made through various government and private
organizations, and an examination of the tools used can be found in the
Procedia Computer Science journal.[50] Internet-scale DNS distributed reflection denial-of-service (DRDoS) attacks have also been made by leveraging the dark web.[51] Many scam .onion sites are also present, offering tools for download that are infected with trojan horses or backdoors.
Fraud services
There are numerous carding forums, PayPal and Bitcoin trading websites, as well as fraud and counterfeiting services.[52] Many such
sites are scams themselves.[53]
Hoaxes and unverified content
Main article: Hoax
There are reports of crowdfunded assassinations and hitmen for hire;[43][54] however, these are believed to be exclusively scams.[55][56] The creator of Silk Road, Ross Ulbricht, was arrested by Homeland Security Investigations (HSI) for his site and for allegedly hiring a hitman to kill six people, although the charges were later dropped.[57][58]
There is an urban legend that
one can find live murder on the dark web. The term "Red Room"
has been coined based on the Japanese animation and urban legend of the same
name. However, the evidence points
toward all reported instances being hoaxes.[59][60]
On June 25, 2015, the indie horror game Sad Satan was reviewed by the YouTube channel Obscure Horror Corner, which claimed to have found it via the dark web. Various inconsistencies in the channel's reporting cast doubt on the reported version of events.[61] There are several websites which analyze and monitor the deep web and dark web for threat intelligence, for example Sixgill.[62]
Phishing and scams
Phishing via cloned websites and other scam sites is common,[63][64] with darknet markets often advertised with fraudulent URLs.[65][66]
Puzzles
Puzzles such as Cicada 3301 and its successors will sometimes use hidden services in order to provide clues more anonymously, often increasing speculation as to the identity of their creators.[67]
Illegal pornography
There is regular law enforcement action against sites distributing child pornography[68][69] – often by compromising the site and distributing malware to its users.[70][71] Sites use
complex systems of guides, forums and community regulation.[72] Other
content includes sexualised torture and
killing of animals[73] and revenge porn.[74]
Terrorism
There are at least some real and fraudulent websites claiming to
be used by ISIL (ISIS), including a fake one seized in Operation Onymous.[75] In the wake of the November 2015 Paris attacks, an actual such site was hacked by the Anonymous-affiliated hacker group GhostSec and replaced with an advert for Prozac.[76] The Rawti Shax Islamist group was found to be
operating on the dark web at one time.[77]
Social media
Within the dark web, there exist emerging social media platforms
similar to those on the World Wide Web. Facebook and other traditional social media
platforms have begun to make dark-web versions of their websites to address
problems associated with the traditional platforms and to continue their
service in all areas of the World Wide Web.[23]
Commentary
Although much of the dark web is innocuous, some prosecutors and
government agencies, among others, are concerned that it is a haven for criminal activity.[78] Specialist
news sites such as DeepDotWeb[79][80] and All Things Vice[81] provide
news coverage and practical information about dark web sites and
services. The Hidden Wiki and its mirrors and forks hold some of the largest directories of
content at any given time.
Popular sources of dark web .onion links
include Pastebin, YouTube, Twitter, Reddit and other Internet forums.[82] Specialist
companies such as Darksum and Recorded Future track dark web cybercrime goings-on for law enforcement purposes.[83] In 2015 it was announced that Interpol now offers a dedicated dark web training program featuring technical information on Tor, cybersecurity and simulated darknet market takedowns.[84]
In October 2013 the UK's National Crime Agency and GCHQ announced the formation of a 'Joint Operations Cell' to focus on cybercrime.[85] In
November 2015 this team would be tasked with tackling child exploitation on the
dark web as well as other cybercrime.[86]
In March 2017 the Congressional
Research Service released an
extensive report on the dark web, noting the changing dynamic of how information
is accessed and presented on it; characterized by the unknown, it is of
increasing interest to researchers, law enforcement, and policymakers.[87]
In August 2017, according to reportage, cybersecurity firms
which specialize in monitoring and researching the dark web on behalf of banks
and retailers routinely share their findings with the FBI and with other law enforcement agencies "when
possible and necessary" regarding illegal content. The Russian-speaking
underground offering a crime-as-a-service model is regarded as being
particularly robust.[88]
Journalism
Many individual journalists, alternative news organizations, and educators or researchers are influential in their writing and speaking about the Darknet, making its use clear to the general public.
Jamie Bartlett
Jamie Bartlett is a journalist and tech blogger for The Telegraph and Director of the Centre for the Analysis of
Social Media for Demos in conjunction with The University of Sussex. In his book, The Dark Net,[89] Bartlett depicts the world of the Darknet and its implications for human behavior in different contexts. For example, the book opens with the story of a young girl who seeks positive feedback to build her self-esteem by appearing naked online. She is eventually traced on social media sites where her friends and family were inundated with naked pictures of her. This story highlights the variety of human interactions the Darknet allows for, but also reminds the reader how participation in an overlay network like the Darknet is rarely in complete separation from the larger Web. Bartlett's main objective is an exploration of the Darknet and its implication for society. He explores different sub-cultures, some with positive implications for society and some with negative ones.[90]
Bartlett gave a TED Talk in June 2015 further examining the subject.[91] His talk, entitled "How the mysterious Darknet is going mainstream", introduces the idea behind the Darknet to the audience, followed by a walkthrough example of one of its websites called the Silk Road. He points out how familiar the webpage design is, resembling consumer sites used in the larger commercial Web.
Bartlett then presents examples of how operating in an uncertain, high-risk
market like those in the Darknet actually breeds innovation that he believes
can be applied to all markets in the future. As he points out, because vendors
are always thinking of new ways to get around and protect themselves, the
Darknet has become more decentralized, more customer friendly, harder to
censor, and more innovative. As our societies are increasingly searching for
more ways to retain privacy online, such changes as those occurring in the
Darknet are not only innovative, but could be beneficial to commercial online
websites and markets.
Other media
Traditional media and news channels like ABC News have also featured articles examining the Darknet.[92] Vanity Fair magazine published an article in October 2016
entitled "The Other Internet". The article discusses the rise of the
Dark Net and mentions that the stakes have become high in a lawless digital
wilderness. It mentions that vulnerability is a weakness in a network's
defenses. Other topics include the e-commerce versions
of conventional black markets, cyberweaponry from TheRealDeal, and the role of operations security.[93]