Data science in the oil and gas industry

I've been doing data science consulting at Ayata for clients in the oil and gas industry. The challenges these clients face differ somewhat from those of typical Internet-based companies. Based on a talk I recently gave at UT Austin, I will explain some of the main data science challenges that most of the oil and gas industry faces.

Future directions for the Kuler project

Over the past several posts, I gave a brief overview of the project and wrote about each step, from data wrangling to machine learning. There is a lot of room for improvement in this project, from automating the web scraping and tweaking the machine learning algorithm to taking other interesting variables into consideration. However, I am…

Clustering colors in a 3D space

Now that we know each color theme can be represented as a 5-point spatial pattern in 3D space, we can use an unsupervised learning algorithm, specifically a clustering algorithm, to group themes that have similar patterns. First approach: Hierarchical Clustering. My first idea was to run a clustering analysis for…
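As a rough illustration of the hierarchical approach mentioned above, here is a minimal sketch in plain Python. It is not the code used in the project. The distance between two themes (Euclidean distance over the five colors compared position by position), the average-linkage rule, and the function names `theme_distance` and `agglomerative_clusters` are all assumptions made for this example.

```python
import math


def theme_distance(t1, t2):
    """Euclidean distance between two themes.

    Each theme is a list of five (L, a, b) tuples. Assumption: colors are
    compared position by position, so the order of colors in a theme matters.
    """
    return math.sqrt(sum((p - q) ** 2
                         for c1, c2 in zip(t1, t2)
                         for p, q in zip(c1, c2)))


def agglomerative_clusters(themes, n_clusters):
    """Naive average-linkage agglomerative clustering.

    Starts with one cluster per theme and repeatedly merges the two closest
    clusters until n_clusters remain. Returns lists of theme indices.
    """
    clusters = [[i] for i in range(len(themes))]

    def linkage(ca, cb):
        # Average pairwise distance between members of the two clusters.
        total = sum(theme_distance(themes[i], themes[j]) for i in ca for j in cb)
        return total / (len(ca) * len(cb))

    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = linkage(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters
```

This brute-force version is only practical for a small number of themes; for a larger collection, a library implementation of hierarchical clustering would be the more sensible choice.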

Color perception and color-space conversion

Before analyzing the color data, we should first know that the RGB color space does not represent the nonlinearity of human color perception well. A color space that is considered perceptually most uniform is the CIELab (aka Lab) color space. In this space, each color is represented by three coordinates: L (lightness), a (red-greenness), and b (yellow-blueness). L ranges…
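For readers who want to see what such a conversion involves, here is a minimal sketch of an sRGB-to-Lab conversion in plain Python. It assumes the input is 8-bit sRGB and uses the D65 reference white, neither of which is stated in the post; the function name `srgb_to_lab` is made up for this example.

```python
def srgb_to_lab(r, g, b):
    """Convert an 8-bit sRGB color to CIELab.

    Assumptions: sRGB input in 0..255 and a D65 reference white.
    """
    def to_linear(c):
        # Undo the sRGB gamma encoding.
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = to_linear(r), to_linear(g), to_linear(b)

    # Linear RGB -> CIE XYZ, using the standard sRGB (D65) matrix.
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    # D65 reference white.
    xn, yn, zn = 0.95047, 1.0, 1.08883

    def f(t):
        # Cube root above a small threshold, linear segment below it.
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    L = 116 * fy - 16
    a_star = 500 * (fx - fy)
    b_star = 200 * (fy - fz)
    return L, a_star, b_star
```

Calling `srgb_to_lab(255, 0, 0)` gives the Lab coordinates of pure red, which can then be treated as a point in 3D space.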

Adobe Kuler

This is an introduction to a toy project that I worked on in 2015 when I was applying for the Insight Data Science Fellowship. This was my first data science project (still unfinished); it uses unsupervised learning to cluster popular color themes in Adobe Kuler. I will talk about the important steps of the project in the following posts.

Measuring irrelevant memory

One of the projects I have been working on aims to measure a person's irrelevant memory. Wait, what? Yes, irrelevant memory. Let's say there is an object with a color and an orientation, like a colored ellipse. I ask you to memorize its orientation. Now orientation is relevant and color is irrelevant. If I want to test whether the color information has been automatically registered (or encoded) in your memory, what should I do? There is one way to test this. Let's say we have an experiment with 100 trials. For the first 99 trials, I ask you to memorize only the orientation from a display, and at the end of each trial I test your orientation memory. But on the 100th and last trial, I ask you to recall the color. And yes, you did not see this coming. That is the most important part: you should not know about the last trial!
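The trial structure described above can be sketched as a short Python function. This is only an illustration of the design, not the actual experiment code. The stimulus ranges (orientation in degrees from 0 to 180, hue in degrees from 0 to 360), the field names, and the function name `build_trials` are hypothetical.

```python
import random


def build_trials(n_trials=100, seed=None):
    """Build a trial list in which only the final trial probes color.

    Every trial shows a stimulus with both an orientation and a color.
    Trials 1..n_trials-1 test orientation memory; the last trial is the
    surprise test of the irrelevant feature (color).
    """
    rng = random.Random(seed)
    trials = []
    for k in range(1, n_trials + 1):
        trials.append({
            "trial": k,
            "orientation_deg": rng.uniform(0, 180),  # hypothetical stimulus range
            "hue_deg": rng.uniform(0, 360),          # hypothetical stimulus range
            "probe": "color" if k == n_trials else "orientation",
        })
    return trials
```

Because the surprise works only once per participant, each participant contributes a single color response in this design.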