Wikimedia servers
From Meta, a Wikimedia project coordination wiki
Warning: Do not rely on any information on this page being up-to-date or correct. The information at https://wikitech.leuksman.com/ is likely to be more accurate.
Wikipedia and the other Wikimedia projects are run from several racks full of servers. The names of the Florida servers were primarily based on famous historical encyclopedists, while the Kennisnet and Yahoo! machines have been named after various types of plants. (See the discussion page for name suggestions.) Recently there have been so many new servers that many have just been given numerical names.
Overall system architecture
- Note: much of the following is out of date as configurations change quickly and frequently. Server roles may be more up-to-date.
- About 300 machines in Florida, 26 in Amsterdam, 23 in Yahoo!'s Korean hosting facility.
- The master database servers run MySQL and store the article metadata.
- The databases are divided among three clusters. See database clusters for more information.
- Text is stored on separate database instances running on Apache servers, to avoid consuming expensive database disk space.
- APC is used to cache compiled PHP opcodes for increased performance.
- The Apaches are running identically configured Apache web servers. They accept requests from users, fetch data from the database if necessary, and send the formatted responses back to the users by running the MediaWiki software, which is implemented in PHP and uses the APC PHP cache (our experience). They share their work directories over NFS, so uploads and similar files should stay in sync.
- The Squid systems maintain large caches of pages, so that common or repeated requests don't need to touch the Apache or database servers. They serve most page requests made by visitors who aren't logged in. They are currently running at a hit-rate of approximately 75%, effectively quadrupling the capacity of the Apache servers behind them. This is particularly noticeable when a large surge of traffic arrives directed to a particular page via a web link from another site, as the caching efficiency for that page will be nearly 100%. See cache strategy for more details.
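The "quadrupling" figure above follows from simple arithmetic: if a fraction h of requests is answered from the Squid cache, only 1 - h of them reaches the Apache servers, so the same backend can sit behind 1 / (1 - h) times as much traffic. A minimal sketch of that calculation follows; the function name is hypothetical, and the model assumes every cache miss costs the backend the same amount of work.

```python
def backend_capacity_multiplier(hit_rate: float) -> float:
    """Return the factor by which a cache with the given hit rate
    multiplies the traffic an unchanged backend can sit behind.

    Assumption (not from the original page): all cache misses cost
    the backend the same, and cache hits cost it nothing.
    """
    if not 0.0 <= hit_rate < 1.0:
        raise ValueError("hit_rate must be in the range [0, 1)")
    # Only the misses (1 - hit_rate) are forwarded to the backend.
    return 1.0 / (1.0 - hit_rate)


# The 75% hit rate quoted above gives a factor of four.
print(backend_capacity_multiplier(0.75))  # 4.0
```

The same formula shows why a traffic surge to a single page is absorbed so well: as the hit rate for that page approaches 100%, the share of requests reaching the Apache servers approaches zero.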
The system is designed to fail over to backup configurations at both the Squid and Apache levels. Backup database support is in place, but database failover is not automatic.
(Details on database replication in MySQL: [1])
Hosting
- See Wikimedia partners and hosts for more
At present all database servers, and most Apaches and Squids, are hosted at the Florida Power Medium data center. From the start of the project until September 2004, Bomis.com paid for all bandwidth.
Kennisnet in the Netherlands has been providing hosting and bandwidth for several servers since June 2005. They are installed at SARA in Amsterdam and serve European regions. The Toolserver cluster is also installed there.
Yahoo! is providing twenty-three servers in South Korea, along with their hosting and bandwidth.
Yarrow was bought by Wikimedia Deutschland, while Zedler and Hemlock were donated by Sun Microsystems to provide hosting for miscellaneous tools written by users, and are not part of the main cluster.
Cluster Nomenclature:
- pmtpa - Powermedium / Tampa, Florida, US
- knams - KennisNet / Amsterdam, Netherlands
- yaseo - Yahoo! / Seoul, South Korea
More equipment, hosting and bandwidth offers for squid cache clusters are welcome; see this page for requirements.
Orders and detailed hardware descriptions
See also: Hardware orders
2007
Quarter 1
- Hardware ordered March 2007 (~ $280,000)
2006
- Hardware purchase Sep-06 (part II not bought mid December) (~ $300,000)
- Hardware Purchase Jul-06 (~ $61,440)
- Hardware Purchase June-06 (~ $60,000)
- Hardware ordered February 06, 2006 (~ $138,000)
2005 - 129 new servers
- Hardware ordered November 15, 2005
- Hardware ordered October 18, 2005
- Hardware ordered October 6, 2005
- Hardware ordered September 14, 2005
- Hardware ordered August 30, 2005
- Donated: 23 multi-purpose servers in South Korea
- Hardware ordered May 2005: 2 db servers.
- Hardware deployed May 9, 2005: 20 apache servers (10 with 1GB, 10 with 3GB for memcached use).
- Hardware ordered January 2005: 10 new servers.
- Donated: 2 squid servers in Paris (florence, sophie)
- Donated: 8 squid servers and three multipurpose servers in Amsterdam
2004 - 39 new servers
- Hardware ordered December 2004: 5 new servers: all 3GB apache/memcached/squid type.
- Hardware ordered October 2004: 7 new servers: 2 database, 5 apache
- Hardware ordered August 2004: 10 new servers. Search database server (bacon), NFS storage server (albert), 8 3.0GHz P4 web servers (diderot, goeje, avicenna, dalembert, tingxi, alrazi, friedrich, harris), gigabit ethernet switch, 146GB SCSI drive for Suda.
- Hardware ordered May 2004: 5 new servers. Replacement for Geoffrin database server (ariel), four 2.8GHz P4 general purpose machines (maurus, rabanus, yongle) and a pair of 250GB ATA drives. Based on upgrade discussion April 2004.
- Hardware ordered January 2004: 9 new servers: 8 multipurpose machines (bart, bayle, browne, coronelli, isidore, moreri, vincent, zwinger) and 1 database server (suda).
- Donated: three for Paris squids, extra RAM for them purchased
Server list
The popularity of the Wikimedia projects necessitates the use of many servers, which run the GNU/Linux operating system.
See the Wikitech server roles page for more details than you probably want. :)
Old servers
The old web servers, which are currently not in service, were also named after historical encyclopedists:
- pliny (Pliny the Elder)
- geoffrin (Marie Thérèse Rodet Geoffrin)
- larousse (Pierre Larousse)
Donations
While Wikipedia is free in both the "free speech" and "no charge" senses of the word, running the web site does cost money. You can help with purchasing new server hardware by donating to the non-profit Wikimedia Foundation: http://wikimediafoundation.org/wiki/Donate
Status and problems
You can check one of the following sites if you want to know if the Wikimedia servers are overloaded, or if you just want to see how they are doing.
If you are seeing errors in real time, visit #wikimedia-tech on irc.freenode.net. Check the topic to see if someone is already looking into the problem you are having. If not, please report your problem to the channel. It helps if you report specific symptoms, including the exact text of any error messages, what you were doing right before the error, and which server(s) are generating the error, if you can tell. The #wikipedia channel may be better populated, and its topic may contain more up-to-date information about the status of the problem. (Note, however, that #wikipedia is for general conversation about Wikipedia.)
If you're wondering if it's only you experiencing this problem, you can check the following sites. Unfortunately, the Wikimedia administrators do not monitor these sites for problems.
If you are getting a "connection refused" error, that is a squid problem. Determine which IP address you are trying to connect to, and ask someone to look at that host.
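One way to determine which IP address you are connecting to is to resolve the hostname yourself. The following is a minimal sketch using Python's standard socket module; the function name resolve_addresses and the default port 80 are assumptions for illustration, not part of any Wikimedia tooling.

```python
import socket
from typing import List


def resolve_addresses(hostname: str, port: int = 80) -> List[str]:
    """Return the distinct IP addresses that `hostname` resolves to.

    Hypothetical helper: it asks the local resolver, so the result
    reflects what your own machine would try to connect to.
    """
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    addresses: List[str] = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        address = sockaddr[0]  # first element is the IP for IPv4 and IPv6
        if address not in addresses:
            addresses.append(address)
    return addresses
```

For example, resolve_addresses("en.wikipedia.org") returns the addresses your resolver currently hands out for that name. Those are the addresses worth quoting when you ask someone to look at the host.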
See also
Wikimedia Commons has media related to: Category:Wikimedia servers
More hardware info
- Recent hardware orders, accounting summary
- Wikimedia partners and hosts
- Technical FAQ - How about the hardware?
Admin logs
- Server admin log - Documents server changes (especially software changes)
Offsite traffic pages
Long-term planning
Useful information about other sites
- Evolution of LiveJournal systems:
- 04/2004 MySQLCon 2004 PDF/SXI
- 07/2004 OSCON 2004 PDF/SXI
- 11/2004 LISA 2004 PDF/SXI
- 04/2005 MySQLCon 2005 PDF/PPT/SXI
- journals to watch for system details: Brad (Fitzpatrick) lj_backend lj_maintenance
- Google cluster architecture (PDF)
- MySQL User's Conference 2004 blog highlights