About Me
• Thierry Schellenbach
• Founder / CTO Fashiolista
• Github/tschellenbach
• Feedly & Django Facebook
• Blog: mellowmorning.com
• @tschellenbach
Today
• Fashiolista’s growth
• Pre Cassandra feed systems
• Github/tschellenbach/Feedly
– Cassandra learnings
– Remaining challenges
Growth
2nd largest fashion community
• 1.5M members
• 17M loves/month
• 94M pageviews (Google Analytics)
Our Stack
• Django/Python
• PostgreSQL / PgBouncer
• Cassandra
• Redis
• Solr
• Celery / RabbitMQ
• AWS / Ubuntu
• Nginx / Gunicorn / Supervisor
• New Relic, Datadog & Sentry
Feed History
1. PostgreSQL
2. Redis – Feedly 0.1
3. Cassandra – Feedly 0.9
More details in this highscalability post:
http://bit.ly/hsfeedly
PostgreSQL - Pull
1. Smooth until we reached ~100M activities
2. Performance spikes caused by the query planner
Cassandra - Feedly 0.9
1. Few moving components
2. Supported by Datastax
3. Instagram
4. Easy to add capacity
5. Cost effective
We open sourced Feedly!
• Github/tschellenbach/Feedly
• Python library for building newsfeed and notification systems on Cassandra and/or Redis
Cassandra Challenges
1. Which Python library to choose?
• Pycassa
• CQLEngine (using the old CQL module)
• Python-Driver (beta)
• Fork CQLEngine to support Python-Driver
– Github/tbarbugli/cqlengine
Cassandra Challenges
2. Importing data (300M loves * 1000 followers = 300 billion activities)
• High CPU load
• Nodes going down
• Start with many nodes, scale down afterwards
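The load comes from fanning each love out to every follower's feed. A quick back-of-the-envelope check with the slide's numbers (the 50k writes/sec sustained rate is an assumption, only there to show the order of magnitude):

```python
# Fan-out on write: every activity is copied into each follower's feed.
loves = 300_000_000       # historical loves to import (slide figure)
avg_followers = 1_000     # average followers per user (slide figure)

rows_to_write = loves * avg_followers
print(rows_to_write)      # 300 billion rows to insert

# At an assumed 50k sustained writes/sec, the import alone takes ~69 days,
# which is why it pays to start with many nodes and scale down afterwards.
seconds = rows_to_write / 50_000
days = seconds / 86_400
print(round(days))
```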
Cassandra Challenges
3. Optimizing import speed (300M loves * 1000 followers = 300 billion activities)
• Python-Driver
• Batch queries
• Non-Atomic (unlogged) batch queries
• Prepared statements
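A minimal sketch of that import loop. The chunking helper is plain Python; the driver calls (`session.prepare`, `BatchStatement` with `BatchType.UNLOGGED`) follow the python-driver API, while the column list matches the timeline_flat schema and the batch size of 100 is an assumption to tune per cluster:

```python
from itertools import islice

def chunked(iterable, size):
    """Yield lists of at most `size` items; each list becomes one batch."""
    it = iter(iterable)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def import_feeds(session, activities, batch_size=100):
    # session is an open python-driver Session; batch_size is an assumption.
    from cassandra.query import BatchStatement, BatchType

    # Prepared once, so Cassandra only parses the statement a single time.
    insert = session.prepare(
        "INSERT INTO timeline_flat (feed_id, activity_id, actor, verb, object, time) "
        "VALUES (?, ?, ?, ?, ?, ?)"
    )
    for chunk in chunked(activities, batch_size):
        # UNLOGGED skips the atomic batch log: much cheaper, no atomicity.
        batch = BatchStatement(batch_type=BatchType.UNLOGGED)
        for row in chunk:
            batch.add(insert, row)
        session.execute(batch)
```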
Cassandra Challenges
4. Data model denormalization
CREATE TABLE fashiolista_feedly.timeline_flat (
  feed_id ascii,
  activity_id varint,
  actor int,
  extra_context blob,
  object int,
  target int,
  time timestamp,
  verb int,
  PRIMARY KEY (feed_id, activity_id)
) WITH CLUSTERING ORDER BY (activity_id ASC)
  AND bloom_filter_fp_chance=0.010000
  AND caching='KEYS_ONLY'
  AND dclocal_read_repair_chance=0.000000
  AND gc_grace_seconds=864000
  AND read_repair_chance=0.100000
  AND replicate_on_write='true'
  AND populate_io_cache_on_flush='false'
  AND compaction={'class': 'SizeTieredCompactionStrategy'}
  AND compression={'sstable_compression': 'LZ4Compressor'};
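With this model, one feed is a single partition ordered by activity_id, so reading a page of a feed is a single-partition slice (the feed_id value and page size here are illustrative):

```sql
-- Latest 25 activities for one feed; reversing the clustering order
-- in the query is allowed because activity_id is the clustering column.
SELECT activity_id, actor, verb, object, time
FROM fashiolista_feedly.timeline_flat
WHERE feed_id = 'feed:1234'
ORDER BY activity_id DESC
LIMIT 25;
```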
Evaluation
7 instances, m1.xlarge, 2.59 TB
Cassandra 2.0.0, CQL3, Python-driver
(Would have been one expensive Redis cluster)
Current Challenges
How do we limit the storage for feeds?
Trimming? (Not supported)
DELETE from timeline_flat WHERE activity_id < 5000
Use a TTL on the rows?
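A TTL would push the trimming into Cassandra itself: each row expires on its own, so feed storage stays bounded without client-side deletes. A sketch (the 30-day TTL and values are arbitrary examples; the trade-off is that old items also expire from rarely-read feeds):

```sql
-- Each activity expires 30 days (2592000 s) after being written.
INSERT INTO fashiolista_feedly.timeline_flat
    (feed_id, activity_id, actor, verb, object, time)
VALUES ('feed:1234', 42, 17, 1, 99, '2013-11-01 12:00:00')
USING TTL 2592000;
```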
Fork Feedly
This is our first time using Cassandra, so let us know how we can further speed up our implementation:
http://bit.ly/feedlycassandra