In my first week at our old 90 Seconds office — 47 Craig Road, Singapore

Last year I joined 90 Seconds with the original goal of building the data team and data analytics products. At the time, however, the company lacked engineering leadership and management, so I was tasked with managing and scaling up the engineering team as the interim director of engineering, a role and a challenge I had not taken on before.

It was undoubtedly one of the most challenging periods of my professional life. Yet I am very happy that I had the opportunity to learn, to make mistakes, and to experience the whole thing. …


90 Seconds Auckland— Office from above

At 90 Seconds, our team’s main responsibility is to lead and build data analytics capabilities across the organization. To fulfill this duty, we have to build up the data team with a clear structure, responsibilities, and direction.

One of the first tasks we set ourselves was to define the goal of the team, its different functions, and how we all fit together.

The team goal

Regardless of where you sit within the data team, you have one goal: to make a positive impact with data.

This can be achieved through:


This is more or less a personal note, and maybe also a guide for people with similarly specced machines who want to try Hackintosh. The whole thing took me about 2 full days to get to a state I’m comfortable with, i.e. ditching my Macbook Pro for good.

Special thanks to Lich Nguyen for helping me through it from the start and answering a lot of newbie questions!

No longer need the Macbook Pro 😁

Why I did this:

  1. Macbook Pros tend to have very loud fan noise when processing basic tasks while connected to multiple external monitors. This has been a pain point of mine for years now, even with…

Having been in this field for a while now, I think this is a good time to write a summary, from my personal experience, of the changes in the last few years, the new directions, or simply what we should do to keep up with the trend in the future.

This blog has two parts: the first one (this one) is about Big Data and Business Intelligence, while the second one will be about Machine Learning and Data Science in general.

1. Big Data, frameworks and tools

https://blogs.oracle.com/oracle-data-warehouse-solutions-business-value-from-data-center-to-cloud

A couple of years back, most companies would stick with Enterprise Data Warehouse solutions from Oracle…


This blog has two parts: the first one is about Big Data and Business Intelligence, while the second one (this one) is about Machine Learning and Data Science in general.

3. Machine Learning

In the past few years, Machine Learning has become a lot more accessible and increasingly in demand:

  • Companies are realizing that they have to employ Data Analytics and Machine Learning capabilities in order to gain competitive advantages for their business
  • Schools now offer Machine Learning and Data Analytics courses taught by experienced people in the field, with very practical mindsets and approaches

Background

As someone who has been researching and taking part in shaping data organizations within various startups of different stages, sizes, and industries, I find this an interesting challenge that, most of the time, should be addressed early on. Making the right decision on this front will not only help companies reduce cost, duplicated work, and turnover, but also remove dysfunctional company cultures.

With that in mind, I will share some of my views on how structures could be defined in different data organizations. …


In this post, I will share with you a simple process that I have been developing when doing Machine Learning in my workplaces.

Hopefully it will clear up a few misconceptions and pitfalls some of us might have about machine learning in general, or about how machine learning differs between competitions, textbooks, and practice.

Step 1: Problem definition

Most of the time, these problems are clearly presented to us data scientists in textbooks or in Kaggle-like competitions, together with a pre-defined dataset, a baseline result, and an evaluation metric that we have to follow.

In practice…


Note: this works for xgboost-0.6a2 and OS X Sierra (10.12.3). I have not tested it elsewhere.

Install Brew and Python

/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
export PATH=/usr/local/bin:/usr/local/sbin:$PATH
# upgrade too if you can
brew upgrade
# installing Python from brew is highly recommended
brew install python

Note: OS X does come with Python (2.7.10) by default at /usr/bin/python, but I would highly recommend the brew version over it.
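To check which interpreter your shell actually picks up after the PATH change, a minimal sketch along these lines may help. The helper name `classify_python` is made up for illustration, and it only assumes that the system interpreter lives at /usr/bin/python as noted above:

```shell
# Hypothetical helper: classify a python path as the OS X system interpreter
# or anything else (for example the Homebrew one under /usr/local/bin).
classify_python() {
  case "$1" in
    "")              echo "none" ;;
    /usr/bin/python) echo "system" ;;
    *)               echo "other" ;;
  esac
}

# Show what the current shell resolves `python` to (empty if not found).
py_path="$(command -v python 2>/dev/null || true)"
echo "python resolves to: ${py_path:-<not found>} ($(classify_python "$py_path"))"
```

If this reports "system", the Homebrew directories are probably not at the front of your PATH yet.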

Install Pip

# install from source code
curl -O http://python-distribute.org/distribute_setup.py
python distribute_setup.py
curl -O https://raw.github.com/pypa/pip/master/contrib/get-pip.py
python get-pip.py
# show current pip version
pip --version
# upgrade pip
pip install --upgrade pip

Note: you can always do sudo easy_install pip like…


— by Dat Le on 23rd Feb 2017

A consistent SQL style helps a lot in code review and the development process, especially since SQL is the main language we use in the data team at honestbee.

Below are all the rules and conventions:

Formatting

  • All keywords need to be capitalized; it helps with readability, even when no syntax highlighting is available.
  • Proper indentation is also important for code readability.
  • Table references should always be used when more than one table is referenced in a query.
  • Avoid using aliases unless absolutely necessary; naming should be underscore-separated.
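As a rough illustration of the rules above, here is a small shell sketch. The query, its table and column names, and the `find_lowercase_keywords` helper are all hypothetical and not part of the honestbee codebase. The helper only checks the keyword-capitalization rule, against a short, non-exhaustive keyword list:

```shell
# Hypothetical query written in the style described above:
# capitalized keywords, indentation, full table references, no aliases.
query='SELECT orders.id, orders.total_price
FROM orders
INNER JOIN customers
    ON customers.id = orders.customer_id
WHERE orders.total_price > 100'

# Print every line that contains a lowercase SQL keyword (with its line number);
# print nothing when the capitalization rule holds.
find_lowercase_keywords() {
  printf '%s\n' "$1" | grep -nwE 'select|from|where|join|inner|left|on|and|or|group|order|by' || true
}

find_lowercase_keywords "$query"
```

A check like this could run before code review, though a dedicated SQL linter would cover far more cases than this keyword list does.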

JOINS

  • Always specify which JOIN function to…

— by Dat Le on 13th Feb 2017

Shibuya Crossing — Tokyo, Japan

Background

Like many other businesses, honestbee operates on a supply (shopper bee and deliverer bee) & demand (consumer) model, with our demand side being much more volatile in real time compared to its counterpart.

As our team’s duty is to ensure the best service for our customers, we need a mechanism to flatten our capacity curve for smoother operations. As a solution, we launched the “Dynamic Time Slot Pricing” feature earlier last week:

Le Nguyen The Dat

Data Science and Engineering at foodpanda
