Thursday, February 25, 2010
There continues to be lots of interest and activity in Ruby. Conferences in San Francisco and Phoenix were announced for this year. Jobs are available, especially in San Francisco.
Product Manager - Slingshot Labs, incubator for News Corporation
Looking for a senior Rails developer.
Josh Susser and Jim Myer
Golden Gate Ruby Conference
Announced fall conference - Sept 10-11, 2010 (Fri/Sat)
Primary Sponsor for Today's Conference
Buzz.com - social website. Have t-shirts. Nothing to do w/ Google.
Derek w/ Sunnyconf in Phoenix
LA Ruby Monthly Meetings
Alf Mcgollough 2nd Thursday of each month.
Looking for west side venue that can handle 50 people.
AT&T interactive hiring
- Ruby and data skills
- Ruby and web and UI skills
- Service type work in Ruby/Rails
Place to pay money to fix open source bugs.
Check out the site.
Ron Evans - thanks to organizers
Large databases are very sensitive to mistakes that don't affect smaller databases. Anything that causes scanning of many records will not only run slowly, it frequently causes the entire web site to go down. Prevent scanning by careful application of indexes and avoiding data transformation operations when querying.
Tim Morgan - scribd
- We learn rails when we break it
- scribd is a really large web site (Amazon S3)
- Large sites are easy to break.
These are the ones you remember for a long time (along w/ everyone else).
- Almost always problems of scale.
- Almost always about how Rails interfaces w/ database.
- Postgres/Oracle, but MySql is what is being used.
- If you add something that is a SQL request, look at the SQL itself.
- Always understand the queries your code is generating. Look at the query log.
- Test with a heavily populated database. If you find it sucks, think what your customers think.
- Pay close attention to your indexes.
- MySql and Postgres have very different implementations of how their indexes work.
The problem with find_in_batches.
- Doing it using User.all.each is going to be very slow.
- Better to use find_each; uses batch for x items per request.
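To make the batching point concrete, here is my own plain-Ruby sketch (not from the talk). FakeUsers and its methods are hypothetical stand-ins for a users table so that it runs without Rails; the real find_each does the equivalent against the database, walking the primary key in batches (1000 rows per batch by default, as far as I know).

```ruby
# Plain-Ruby sketch of the batching idea behind find_each / find_in_batches.
# FakeUsers is a hypothetical stand-in for a users table; no Rails required.
class FakeUsers
  attr_reader :queries

  def initialize(count)
    @rows = (1..count).map { |id| { id: id, login: "user#{id}" } }
    @queries = 0
  end

  # Simulates: SELECT * FROM users WHERE id > ? ORDER BY id LIMIT ?
  def where_id_greater_than(last_id, limit)
    @queries += 1
    @rows.select { |row| row[:id] > last_id }.first(limit)
  end

  # Yields one record at a time while holding only one batch in memory.
  def each_in_batches(batch_size)
    last_id = 0
    loop do
      batch = where_id_greater_than(last_id, batch_size)
      break if batch.empty?
      batch.each { |row| yield row }
      break if batch.size < batch_size   # short batch means we reached the end
      last_id = batch.last[:id]
    end
  end
end

users = FakeUsers.new(2_500)
seen = 0
users.each_in_batches(1_000) { |_row| seen += 1 }
puts "visited #{seen} rows using #{users.queries} queries"
```

User.all.each, by contrast, would build all 2,500 objects (or all several million) in memory before the first iteration runs.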
Composite primary keys? Use composite keys plugin.
http://gist.github.com/105318 is a monkey-patch
- :case_sensitive => false
- took site down.
- problem was SQL lower(login_name)
- In mysql, make the case-insensitive column binary.
- solution http://gist.github.com/105367
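Why lower(login_name) hurts: the index is built on the stored values, so wrapping the column in a function forces the database to transform every row before comparing. The following is my own toy model in plain Ruby, with a Hash standing in for the index and made-up rows. It is not what the gist above does; it only shows the general alternative of normalizing once at write time.

```ruby
# Hypothetical in-memory model: an "index" is a Hash from stored value to row.
rows = [
  { id: 1, login_name: "Alice" },
  { id: 2, login_name: "Bob" },
  { id: 3, login_name: "CAROL" }
]

# Index on the raw column, as a database would build it.
raw_index = rows.each_with_object({}) { |row, idx| idx[row[:login_name]] = row }
raw_hit = raw_index["carol"]   # nil: the raw index cannot answer this lookup

# WHERE lower(login_name) = 'carol': the index keys are untransformed, so every
# row has to be visited and lowercased before it can be compared.
rows_examined = 0
scan_hit = rows.find do |row|
  rows_examined += 1
  row[:login_name].downcase == "carol"
end

# Alternative: normalize once at write time and index the normalized value.
normalized_index = rows.each_with_object({}) do |row, idx|
  idx[row[:login_name].downcase] = row
end
index_hit = normalized_index["carol"]   # single lookup, no scan

puts "raw index hit: #{raw_hit.inspect}"
puts "scan examined #{rows_examined} rows, found id #{scan_hit[:id]}"
puts "normalized index found id #{index_hit[:id]}"
```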
problem with delete and destroy
- caused by misleading Rails documentation
- delete_all is faster than destroy because it doesn't use hooks.
- before_destroy positioning is important, must be placed before any associations.
- delete_all() doesn't remove the link table record itself; it just sets the id column to nil.
- Solution: use delete_all
- CategoryMembership.delete_all :category_id => self.id
- Fixed in Rails 3.0
problem with indexes
Think about indexes early.
- mysql uses only one index at a time, so you may have to figure out an index on multiple columns.
- you may have to tell MySql which index to use:
- use index (index_documents_on_user_id).
- Always understand the queries your code is generating.
- Test with a heavily populated database.
- Pay close attention to your indexes.
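A toy illustration of the "one index at a time" point, again my own plain-Ruby sketch rather than anything shown in the talk. The documents rows and the status column are made up; Hashes built with group_by stand in for the indexes.

```ruby
# Hypothetical documents table; column names are illustrative only.
documents = [
  { id: 1, user_id: 7, status: "public"  },
  { id: 2, user_id: 7, status: "private" },
  { id: 3, user_id: 9, status: "public"  },
  { id: 4, user_id: 7, status: "public"  }
]

# Single-column index on user_id: the database uses it to narrow the rows,
# then has to check the second condition on every candidate.
by_user = documents.group_by { |doc| doc[:user_id] }
candidates = by_user.fetch(7, [])
single_index_hits = candidates.select { |doc| doc[:status] == "public" }

# One composite index on (user_id, status): both conditions resolved at once.
by_user_and_status = documents.group_by { |doc| [doc[:user_id], doc[:status]] }
composite_hits = by_user_and_status.fetch([7, "public"], [])

puts "single index: examined #{candidates.size} rows, kept #{single_index_hits.size}"
puts "composite index: examined #{composite_hits.size} rows"
```

With four rows the difference is nothing; with a user who owns a few hundred thousand documents, the leftover filtering step is where the time goes.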
Fascinating talk on using probabilistic data structures to save oodles of search time if a limited number of false indicators can be tolerated.
Tyler McMullen - Scribd
Different Data Structures
Some very interesting structures
- Bloom Filter
- Splay Tree
- Tests for existence in a set
- Minimal memory use
Example: 100million strings in a set
Traditional set: 10 GB minimum vs. 280 MB with a Bloom filter
How does it work?
Binary sequence. Uses hash
In places where occasional false positives are okay
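A minimal Bloom filter sketch in Ruby, written by me after the fact and not the speaker's code. The bit count, the number of hashes, and the use of SHA-256 with double hashing are all assumptions chosen for readability; a real implementation would pack the bits and pick sizes from the expected item count and the tolerable false-positive rate.

```ruby
require "digest"

# Minimal Bloom filter sketch. Sizes and the hashing scheme are illustrative.
class BloomFilter
  def initialize(bit_count, hash_count)
    @bit_count = bit_count
    @hash_count = hash_count
    @bits = Array.new(bit_count, false)   # a real one would pack the bits
  end

  def add(item)
    positions(item).each { |pos| @bits[pos] = true }
    self
  end

  # false => definitely absent; true => probably present.
  def include?(item)
    positions(item).all? { |pos| @bits[pos] }
  end

  private

  # Derives hash_count bit positions from two base hashes (double hashing).
  def positions(item)
    digest = Digest::SHA256.hexdigest(item.to_s)
    h1 = digest[0, 16].to_i(16)
    h2 = digest[16, 16].to_i(16)
    (0...@hash_count).map { |i| (h1 + i * h2) % @bit_count }
  end
end

filter = BloomFilter.new(1_024, 3)
%w[ruby rails rack].each { |word| filter.add(word) }

puts filter.include?("ruby")     # added items are never reported absent
puts filter.include?("python")   # usually false; true would be a false positive
```

Note that the filter never stores the strings themselves, which is where the memory saving comes from. It also means items cannot be listed or removed in this simple form.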
find items within a distance of a target
reduces search space
works inside a metric space
If we know the distance between 2 of 3 points, then we can make assumptions about the distance between the remaining "unmeasured" two points.
- Most often used for spelling corrections
- Work in any metric space
- Reduce the search space.
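My notes lost the name of this structure, but the description matches what is usually called a BK-tree, so that is what I sketch here (the name is my assumption). It uses Levenshtein edit distance as the metric and the triangle inequality to skip whole subtrees. The word list is made up.

```ruby
# Levenshtein edit distance: a true metric, so the triangle inequality holds.
def edit_distance(a, b)
  previous = (0..b.length).to_a
  a.each_char.with_index(1) do |char_a, i|
    current = [i]
    b.each_char.with_index(1) do |char_b, j|
      cost = char_a == char_b ? 0 : 1
      current << [previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost].min
    end
    previous = current
  end
  previous.last
end

class BKTree
  Node = Struct.new(:word, :children)   # children: edge distance => Node
  attr_reader :comparisons

  def initialize
    @root = nil
    @comparisons = 0
  end

  def add(word)
    if @root.nil?
      @root = Node.new(word, {})
      return self
    end
    node = @root
    loop do
      distance = edit_distance(word, node.word)
      return self if distance.zero?   # already present
      child = node.children[distance]
      if child.nil?
        node.children[distance] = Node.new(word, {})
        return self
      end
      node = child
    end
  end

  # Returns all stored words within max_distance of target.
  def search(target, max_distance)
    @comparisons = 0
    matches = []
    stack = @root ? [@root] : []
    until stack.empty?
      node = stack.pop
      @comparisons += 1
      distance = edit_distance(target, node.word)
      matches << node.word if distance <= max_distance
      # Triangle inequality: only subtrees whose edge label lies within
      # max_distance of this node's distance can contain a match.
      node.children.each do |edge, child|
        stack << child if (edge - distance).abs <= max_distance
      end
    end
    matches
  end
end

words = %w[book books cake boo cape cart boon cook]
tree = BKTree.new
words.each { |word| tree.add(word) }

suggestions = tree.search("bok", 1).sort
puts "suggestions: #{suggestions.inspect}"
puts "compared against #{tree.comparisons} of #{words.size} words"
```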
- Self-balancing binary tree
- Brings most accessed items toward root
- The more uneven the access pattern, the better the performance.
Good for caches, garbage collectors, etc.
- O(1) (order 1) on lookup, add, removal
- Ordered traversals
- Prefix matching
- Excellent memory management.
Useful as an autocompleter.
Interesting, he implemented this as a rack filter.
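The last structure (ordered traversal, prefix matching, autocomplete) sounds like a trie; the name did not make it into my notes, so treat that as my assumption. Here is a small Ruby sketch of one, not the speaker's Rack implementation. Lookup cost depends on the length of the key rather than on how many words are stored.

```ruby
class Trie
  Node = Struct.new(:children, :terminal)   # children: char => Node

  def initialize
    @root = Node.new({}, false)
  end

  def add(word)
    node = @root
    word.each_char { |char| node = (node.children[char] ||= Node.new({}, false)) }
    node.terminal = true
    self
  end

  def include?(word)
    node = find_node(word)
    !node.nil? && node.terminal
  end

  # All stored words starting with prefix, in sorted order.
  def complete(prefix)
    node = find_node(prefix)
    return [] if node.nil?
    results = []
    collect(node, prefix, results)
    results
  end

  private

  # Walks down one character at a time; nil if the path does not exist.
  def find_node(string)
    node = @root
    string.each_char do |char|
      node = node.children[char]
      return nil if node.nil?
    end
    node
  end

  # Depth-first, visiting children in character order for an ordered traversal.
  def collect(node, word, results)
    results << word if node.terminal
    node.children.keys.sort.each do |char|
      collect(node.children[char], word + char, results)
    end
  end
end

trie = Trie.new
%w[ruby rails rack rake rubygems].each { |word| trie.add(word) }

p trie.complete("ra")
p trie.complete("rub")
p trie.include?("rub")
```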
It's great to see Sarah evangelizing software training to kids. I've thought about it for a long time and her presentation will spur me to do it. Very good hints on how to do it so that you and your students both thoroughly enjoy it and become better practitioners.
Sarah Mei - Teaching Ruby to Kids
Teaching is her hobby.
Most programming instructors = FAIL
Teacher needs to be a coder.
Programming is becoming part of basic literacy.
Why should you teach?
- Teaching leads to learning by the teacher.
- Teaching not rocket science.
- set goals
- form a plan but expect to adapt
- keep iterations short
Form a plan
- What do I start with?
- Keep your goals in mind.
- Software Teaching Tools:
- Hackety Hack
- Small Ruby
- Kids love anything visual
- Anything interactive
- irb: compelling for kids (maybe)
- Install all the tools you might use on all the computers the kids have access to.
- start small
- Use the internet.
- Your "lesson plan" should be a series of very small steps.
- 15 minutes or less
Listen to the customer
- Follow tangents!
- don't stick to a plan because it's the plan.
- Don't worry about "finishing"
- Look for teachable moments.
- Look for signs they've turned off
"Ruby: the programming language for extroverts"
- Do it often, practice
- Teaching is a learned skill.
- Take all opportunities you can to teach.
- talks at your local meetup
- pair programming
- summer camps, etc, need volunteers
- National Lab Day
- In SF, I always need teachers for intro workshops.
Expect some things you try to fall flat.
- Some students won't engage
- Keep at it.
- You should teach
- You can teach
- Agile is for more than just development.
Ruby is a great first language.
For really young kids:
- Kodu (Microsoft)
- ISTE has curriculum for elementary school.
- cs-unplugged (a web site)
These are drag-and-drop environments.
Bjorn showed New Relic in action with how it monitors web sites and identifies issues early and clearly. Unfortunately my blog entry here fails to capture much of it as it was mostly demo. Worth looking into.
Bjorn Freeman-Benson - New Relic
Building the First Successful Human-Powered Airplane
1977 - gossamer condor
Why did Paul succeed?
How to make the lightest possible airplane as quickly as possible? Ended up crashing a lot.
Could repair in less than 12 hours. Others crash repairs 6 months.
Macready team could iterate faster.
Applied to Software Development
Presenter wants to be that agile.
How he uses the New Relic to do that.
30 days of RPM Gold for free.
Monday, February 22, 2010
Luigi showed how government has lots of useful data but few tools to make sense of the data. Here's where software developers can make a contribution: to build free tools to make this data more accessible. He talked about the various opportunities and why accomplishing them would make a real difference.
Purpose: get government to open up its data and provide software tools to comprehend it.
Over a thousand people in their effort.
"D.C. is Hollywood for ugly people."
- Electoral Politics no
- Governance yes
- Open source
Civic Side Projects
- Challenging entrenched bureaucracies
- Open source + Open data = better Government
- Government opens data; they write apps around it.
- Government as a wholesaler, not retailer.
Sunlight Labs API
- Bio and contact info for elective office holders.
- Example: how much health insurance money has been spent politically and how.
GovTrack.us - Bills and Vote Records
MAPLight.org - Vote Influence
- How representative voted correlated w/ donations.
Code for America
- Will choose 5 cities
- 5 developers will be supplied to each of those cities.
- Modeled after Teach for America program.
- #transparency on Freenode
- github didn't get actual repository
- Enhance your skillset
- Low risk, high reward
- Another testing framework? Really?
- Local/state govts. an untapped market
- Solve a hard problem.
- David Cameron, in a TED talk: "The next age of government"
The traditional Ruby engine is paranoid about memory management because it has to run in so many disparate environments. This negatively impacts the garbage collection performance. If you know or can define where your Ruby installation will run, you can do optimizations that will greatly speed this process. Indeed, this is one of the benefits of Ruby Enterprise, and Ruby 1.9 accomplishes a subset of the optimizations discussed here.
Note: this session went extremely fast and I was not able to collect the notes as I wanted.
Joe Damato and Aman Gupta
Garbage Collection and the Ruby Heap
- Why GC
- Ruby is simple and elegant
- GC makes life easier.
- No more memory management
- Memory management
- memory leaks
- Always allocated on heap
- Fixed size
- sizeof(struct RVALUE) = 40
See their site to see how to optimize the GC.
Ruby memory leaks:
- These are reference leaks
memprof - replacement for gdb.rb and bleak_house
Good discussion of the different facilities available to Ruby and the underlying operating system and their tradeoffs.
Aman Gupta - Joe Damato - Threads
http://timetobleed.com Joe's blog.
What is a thread?
A thread is just a set of execution state.
- Green threads
- Kernel doesn't know they exist
- Implementation is in userland.
- Create lots cheaply
- Switch them.
- Schedule them however you want.
- Main one is that these can switch only between a single Ruby process.
Native Threads 1:1
- Kernel knows they exist
- Some user land code.
- Take advantage of SMP
- Shared memory
- Blocking in one thread doesn't block
- didn't get
Hybrid Threads (M:N)
- Take advantage of SMP
- Cheap setup and teardown
- Need 2 schedulers
Ruby 1.9 and Erlang use hybrid threads
- Operating system switches process regardless of process states.
- thread gives up voluntarily.
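To illustrate the cooperative case ("gives up voluntarily") in userland, here is my own small sketch using Ruby's Fiber class (core since 1.9). It is not from the talk, and the round-robin scheduler is deliberately trivial. Each worker runs until it calls Fiber.yield; nothing can interrupt it before then, which is exactly the difference from preemptive switching by the operating system.

```ruby
log = []

# Two workers that each do three steps, yielding after every step.
workers = %w[A B].map do |name|
  Fiber.new do
    3.times do |step|
      log << "#{name}#{step}"
      Fiber.yield          # voluntarily hand control back to the scheduler
    end
    :done                  # final value returned by the last resume
  end
end

# A trivial round-robin scheduler living entirely in userland: resume each
# fiber in turn and drop it once it reports that it has finished.
until workers.empty?
  workers.reject! { |fiber| fiber.resume == :done }
end

puts log.join(" ")
```

If one worker never yielded (say, it blocked on a slow call), the other would never run, which is the main cost of purely cooperative scheduling.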
lsof - "list open files" - a utility. Can also be used to get a list of open sockets.
strace - trace system calls and signals.
Ruby: SIGVTALRM used
- heap_stacks branch
- heap_stacks_186 branch
- fibers branch
A warning that the rate of change in underlying software development paradigms will require new mental approaches to the large software challenges lying ahead of us. The "algol-based" languages used by the vast majority of developers will yield to more scalable functional languages.
In the meantime, continue to grow your skills and constantly learn how to take advantage of tools to get more bang for the time you spend designing and coding.
A new look at software development - What will the next 10 years bring?
- You're doing it completely wrong.
- Software is hard
- Software construction is the most complex endeavor ever undertaken by mankind.
- The only software that's worth making is software that does something new.
- It's only getting (more)...
- more complex
- life critical
30 years of software
- popular languages
- flavors, blends, derivatives
- Fortran->Algol 54/58
- Lisp 58
- Ruby described as new-age lisp.
- Smalltalk from lisp from simula
- Prolog --> Erlang
- ML --> Haskell
Time for a change?
Let the computer do more work.
- What you want to do instead of how.
strong type systems
- language agda
Tools that cooperate
- real-time analysis of our codes.
Do more for the developer
- Giving some insight into the code you're writing.
- Static analysis
Generated test suites
- Identify boundary cases.
Will current ideas continue to serve us?
Speaker says NO!
new way of programming
We need parallel strategies
- problem decomposition
- data structure design
- algorithmic organization
- better languages
- better tools
- tools that help us.
Google's "Go" language
- everything is parallelized.
parallel and distributed baked in
- that actively prevent bugs
closing quotes: (Guy Steele):
- The bag of programming tricks that has served us for 50 years is the wrong way to think going forward and must be thrown out.
- The great tricks of sequential programming don't work.
- It's a parallel world of parallel problems.
- Have strategies that assume imperfection. How do we write code that way?
Programming for mobile devices is a lot more than adjusting to the smaller screen; there are additional opportunities in the mobile devices themselves: GPS, camera, motion sensors, etc. However, they have significant challenges: comparatively primitive development environments, and operating systems. These are discussed along with how HTML 5 and Rhomobile are working towards a single programming API across mobile platforms.
This was presented by Sarah Allen of Blazing Cloud.
Rhomobile framework called "Rhodes"
Mobile app development sucks:
- In some ways is archaic
- Old languages
Brand Transcends Platform
My brand instead of cell phone brand. (Do I agree w/ this?).
Mobile gives you more than desktop:
- Everyone you know is connected
Means you have different opportunities than desktop.
write Once - Run Anywhere
How to get code onto the device.
- Rhodes is similar to Rails.
- Views are HTML.
- It all works within the device. So a kind of HTML processor is inside the device.
It's also analogous to Rails:
- Controller -> RhoController
- Model -> Rhom
- View -> eRB files
The primary danger in the major rewrite is proceeding before the business has bought into it; this is usually fatal (or you end up wishing it was fatal). Along with the important technical skills needed, this talk identifies how to know when the business is behind it, and how to help to navigate the business to support the rewrite. (Or, to determine that it's not a good idea to do the rewrite at this time.)
The Big Rewrite, Doing it Right
Rich Kilmer
BTW, he used a presentation software package called Prezi, which was very effective with swirling text while zooming.
Drivers for a Rewrite
- Must be business driven
- Must NOT be technology driven
- Don't call it a rewrite
- Complete in a major release cycle
Preparing for a rewrite
- Drop a major release before you start.
- One the customer is really happy with.
- Understand your domain
- Or have a domain expert available all the time.
- Break down the current system into logical sets of functionality. (Rich later showed the resulting code which was incredibly clean; this made a (hopefully) lasting impression on me.)
- Choose the right technology for what you want to do. Examples:
- Develop a standard worker framework (minion, resque)
- Dedicate resources to repeatable data migration
- Keep services code consistent, models clean
- Use the right tool for each job
- Perform incremental migrations of historic data
- Prepare business users for potential disruptions
- Run flip scenarios several times
- Enable "read only" system during final flip (if needed)
- Provide a way to fall back if the flip fails
- Don't code for assumptions.
- If you find that you want to use the same name for two different classes, you may have two different domains which might need different applications.
- Design for the expectation that backups of separate systems will probably not be taken synchronously, so be prepared to reconcile the disparities on recovery.