Lessons from Amazon Web Services

- very expensive for a 24/7 server but very inexpensive when used for on-demand storage/computation
- consequently, do not use it as your main hosting platform unless you get direct revenue from it (stick to having your own actual server in a datacenter, no VPS, no cloud)
- it is also nice if you want to rent to a client for a short period of time, or if you can charge the client a higher cost, while minimizing your setup cost (through an AMI)
Notes
- IRC channel ##AWS on Freenode
- my notes on CloudCamp 2009 in Paris (summer 2009)
- H/W OpenGL at Amazon Web Services Developer Community
- CAPS Compute Lab
- EU-West and US-East
- different data in the console (Key pairs, Instances, ...)
- separated bills?
- Programming Amazon Web Services by James Murty, O'Reilly Media 2008
Authentication
- Putty error on Windows "Unable to use key file "my_key.pem" (OpenSSH SSH-2 private key)"
- open it with puttygen my_key.pem
- save it to my_key.pem.ppk
- load it with putty -i my_key.pem.ppk
- AWS security credentials
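The PuTTY key conversion described above can also be scripted. This is a minimal sketch, assuming the command-line puttygen that ships with PuTTY's Unix tools is installed (the Windows puttygen is mainly a GUI); the helper names puttygen_command and convert_key are hypothetical.

```python
import shutil
import subprocess


def puttygen_command(pem_path, ppk_path=None):
    """Build the puttygen call that converts an OpenSSH key to a .ppk file."""
    if ppk_path is None:
        # Same naming as in the notes: my_key.pem -> my_key.pem.ppk
        ppk_path = pem_path + ".ppk"
    # -O private selects PuTTY's own private key format, -o names the output file
    return ["puttygen", pem_path, "-O", "private", "-o", ppk_path]


def convert_key(pem_path, ppk_path=None):
    """Run the conversion; assumes puttygen is available on PATH."""
    cmd = puttygen_command(pem_path, ppk_path)
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("puttygen not found on PATH")
    subprocess.run(cmd, check=True)
    return cmd[-1]
```

Calling convert_key("my_key.pem") would write my_key.pem.ppk next to the original key, which can then be loaded with putty -i.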
Billing
- date
- for EC2 and S3
- "You will be charged for your usage of this web service and any applicable taxes on the next billing date"
- which is better for the cashflow
- non Amazon tools
- CloudSplit Real-time Cloud Analytics. Real-time spending insight. Real-time cost control.
Tools
- cloud exchange listing spot prices over time
- Using the AWS Console with Amazon EC2 by Mike Culver
- Public Data Sets on AWS centralized repository of public data sets that can be seamlessly integrated into AWS cloud-based applications.
- AWS Simple Monthly Calculator Note: 1 Month ~ 732 Hours
- Usage Report for EC2 and S3
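The calculator's "1 Month ~ 732 Hours" convention (30.5 days of 24 hours) makes the 24/7 versus on-demand comparison easy to check by hand. The sketch below only illustrates the arithmetic; the hourly rate of 0.10 is a hypothetical figure, not an actual AWS price.

```python
from decimal import Decimal

HOURS_PER_MONTH = 732  # 30.5 days * 24 hours, as used by the calculator


def monthly_cost(hourly_rate, hours_used=HOURS_PER_MONTH):
    """Cost of one instance for the given number of hours in a month."""
    return hourly_rate * hours_used


rate = Decimal("0.10")  # hypothetical hourly rate, for illustration only

always_on = monthly_cost(rate)                 # server running 24/7
on_demand = monthly_cost(rate, hours_used=20)  # 20 hours of occasional computation

print("24/7:", always_on, "on-demand:", on_demand)
```

With the same rate, the bill scales linearly with the hours used, which is why occasional use stays cheap while a permanent server does not.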
EC2
- Latest User Guide for EC2
- official EC2 tutorial recommended by Sylvain
- Amazon EC2 AMI Tools from Amazon Web Services Developer Community
- includes ec2-ami-tools-version, ec2-bundle-vol, ec2-download-bundle, ec2-migrate-manifest, ec2-upload-bundle, ec2-bundle-image, ec2-delete-bundle, ec2-migrate-bundle, ec2-unbundle
S3
- S3Fox Organizer(S3Fox) Firefox browser tab with a GUI similar to dual-pane layout FTP clients
- s3fs FUSE-based file system backed by Amazon S3
- Installing FUSE + s3fs and sshfs on Ubuntu by Eric Marden, xentek May 2009
- torrent S3 files
- "Any publicly available data in Amazon S3 can be downloaded via the BitTorrent protocol, in addition to the default client/server delivery mechanism. Simply add the ?torrent parameter at the end of your GET request in the REST API." according to Amazon Simple Storage Service FAQs
- S3fm, web-based S3 explorer
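The ?torrent parameter quoted from the FAQ above only changes the URL of the GET request. A minimal sketch, assuming a publicly readable object addressed with the virtual-hosted URL style; the bucket and key names are hypothetical.

```python
from urllib.parse import quote


def torrent_url(bucket, key):
    """Return the URL that asks S3 for the .torrent of a public object."""
    # quote() escapes unsafe characters in the key but keeps "/" separators
    return "https://{}.s3.amazonaws.com/{}?torrent".format(bucket, quote(key))


# Hypothetical bucket and key, for illustration only
print(torrent_url("my-bucket", "datasets/archive.tar.gz"))
```

The resulting URL can be handed to any BitTorrent client, assuming the service still offers the feature for the object in question.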
Hive
Moved to ApacheProjects#Hive
VPC
- Amazon Virtual Private Cloud (Amazon VPC) "secure and seamless bridge between a company’s existing IT infrastructure and the AWS cloud."
- limited beta account request on 04/10/09
- according to Wikipedia a Virtual private cloud is "a private cloud existing within a shared or public cloud (i.e. the Intercloud)."
- Introducing Amazon Virtual Private Cloud (VPC), Amazon Web Services Blog August 2009
- nice visuals
MTurk
- The Mechanical Turk Blog (official)
- TurKit Java/JavaScript API for running iterative tasks on Mechanical Turk
- TurkPipe batch Amazon Mechanical Turk jobs at the command line
- creative integrative projects
- Vision At Large large scale data collection for computer vision research
- VizWiz Nearly Realtime Answers to Visual Questions
- soylent Microsoft Word plug-in utilizing Mechanical Turk
Novelty discovery HIT
Facilitate novelty discovery.
Solution
gather links following a few simple rules
- has to be new (not already part of the list)
- cannot be "people"
Examples
http://www.ifi.uzh.ch/ddis/research/semweb/simpack/ 2005 2008
- will earn X
http://www.e-lico.eu/ 2009 2012
- will earn Y
a project without a starting date or without a finishing date will earn less
Difficulties
- limit to research project? labs? published?
- consider "clusters" of projects, for example ITER or LHC are spawning a myriad of sub-projects
- checking the result without wasting time
- quality of each link knowing that it will probably be hundreds of them
- ask other turkers to check it, but I'm afraid they would be too "positive", since I guess there is a kind of turker community (there are dedicated websites already)
- only US residents are allowed to pay for tasks
Notes as a requester
Economic arms race (Seedea:Research/Drive) between
- the person wanting the information (Requester)
- the community (including but not limited to the official https://requester.mturk.com/mturk/welcome )
- unorganised individuals
- AI researchers
- organised individuals
- practitioner websites
- legal entities
Work is done (or not) when an equilibrium is reached between the perceived
- price defined by estimation of
- maximum number of items (HITs)
- total budget
- value of the processed result
- thus based on the ulterior motive
- amount of work required per task
- amount of work required to handle the result of every task
The community is not homogeneous regarding its skills and its economic needs, consequently one might prefer a distribution rather than an oversimplified view.
From the requester's point of view, the price could also start at the lowest point and then increase over time, so that the larger number of items is covered at the lower price before the price rises, presumably based on the difficulty of completing the remaining HITs. This is probably specific to this problem though. Also note that if the community perceives that prices increase over time, it could give an incentive to delay participation.
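The rising price idea can be compared with a single flat price using a toy model. This is only a sketch under strong assumptions that are not in the notes: each HIT is taken by one worker with a known reservation price (in cents), workers accept as soon as the posted price meets it, and nobody waits strategically for a later increase. The function names are hypothetical.

```python
def ascending_price_cost(reservation_prices, start, step):
    """Total paid and rounds needed when the price rises by `step` each round."""
    if step <= 0:
        raise ValueError("step must be positive, otherwise the loop may never end")
    remaining = list(reservation_prices)
    price = start
    total = 0
    rounds = 0
    while remaining:
        rounds += 1
        # Workers whose reservation price is met complete their HIT this round
        accepted = [r for r in remaining if r <= price]
        total += price * len(accepted)
        remaining = [r for r in remaining if r > price]
        if remaining:
            price += step
    return total, rounds


def flat_price_cost(reservation_prices):
    """Total paid when one price high enough for every worker is posted at once."""
    return max(reservation_prices) * len(reservation_prices)


# Hypothetical, non-homogeneous community: reservation prices in cents
community = [1, 2, 2, 5, 10]
print(ascending_price_cost(community, start=1, step=1))  # (20, 10)
print(flat_price_cost(community))                        # 50
```

In this toy case the rising schedule pays 20 cents over 10 rounds against 50 cents in a single round, so the saving is bought with time. If workers anticipate the increases and delay, as noted above, the saving shrinks.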
See also
- WithoutNotesMay11#LiquidPub for "diversity-aware search"
- InformationRules
- LeanThinking
- http://behind-the-enemy-lines.blogspot.com/search/label/mechanical%20turk
- Seedea:Seedea/Onlineoutsourcing
for alternatives, including CrowdFlower and its API
Provide to turkers
- provide them the tools to do so better (incl. collaboration?)
- seed the DB live with
- RSS feed of http://www.nsf.gov/funding/pgm_list.jsp?org=NSF&ord=rcnt
- the equivalent for each country
directly check source in scientometrics research
See also
- Seedea:Seedea/Onlineoutsourcing
for outsourcing online in general
- Needs for more potential tasks
Legislation
- DevPay Availability in non-US countries, Amazon Web Services Developer Community, October 2009
- still not available as of August 2010
- Estimated taxes
- "for business purposes [...] contact us at webservices@amazon.com and provide your VAT registration number and address/details about your company"
To consider
- Who does that server really serve? by Richard Stallman, FSF
- autonomo.us Toward Free Network Services
- GNU Affero General Public License (AGPL), FSF
- Free Network Services by Bradley M. Kuhn, LibrePlanet, March 2010
- When Will Hosting Sites Allow AGPLv3 Code? by Bradley M. Kuhn, SFLC Blog 2008
To explore
- cloudlets universal server images for the cloud.
- Cloudlets: universal server images for the cloud, FOSDEM February 2010
- Economic Models and Algorithms for Distributed Systems Springer 2010
- running Apache Mahout on EC2
Note
My notes on Tools gather what I know or want to know. Consequently they are not and will never be complete references. For this, official manuals and online communities provide much better answers.


