Social Media Masterclass – 25/5/2013 Circular Quay

More of my MBA masterclass notes, this time from the session on Social Media, hosted at the Sydney Business School’s Circular Quay campus.

Social media is intrinsically linked to smartphone and tablet usage, and consequently social media is popular in Australia because there’s an enormous consumer appetite for mobile technology. For many companies, participating in social media means relinquishing control of their marketing communication channels, and it’s this difference that often catches brand managers off guard – for example, when Qantas launched their “luxury is” first-class social promotion.

The speed and amplification of news through social channels means that trends can escalate quickly – some positive, some negative. The general rule of thumb seems to be that a business has about four hours to contain material online before it starts to reach ‘critical mass’ of re-posting, re-tweeting and so on.

There is a high level of convergence between traditional media and social channels. News media outlets constantly monitor social feeds for trending topics, so bad news on social media can escalate quickly once traditional media picks a story up. In many ways traditional media is becoming an aggregator of online and social content – for example, the SMH trawling Facebook and Twitter for emerging trends.

“One size” doesn’t fit all applications when it comes to social media management – it needs to be tied to the company’s business and marketing strategy. There is no bullet-proof list of reasons for an organisation to start using social media, and in some cases it might even be detrimental. An often-overlooked fact is that there is typically a high level of risk associated with social channels that can be prohibitively expensive to manage correctly.

There is a 70-20-10 rule for spending the marketing budget (a quick arithmetic sketch follows the list):
70% of the spend goes on tried and true channels
20% of the spend goes on innovation/diversification channels
10% goes on experimental channels – this is where the Coca-Cola named-cans campaign came from
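
As a back-of-the-envelope illustration – this is my own sketch, not the presenter’s, and the $500k figure is made up – the split works out like this:

<?php
// Illustrative 70-20-10 split over a hypothetical $500k marketing budget.
$budget = 500000;
$split = array(
    'tried and true'             => 0.70,
    'innovation/diversification' => 0.20,
    'experimental'               => 0.10,
);
foreach ($split as $channel => $share) {
    printf("%-27s \$%s\n", $channel, number_format($budget * $share));
}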

Ultimately social is just another channel in the marketing mix, but it does seem to be forcing business leaders to become more accountable. “The new price of ‘doing well’ in business is ‘doing good’.”

Lean & Agile Startup Masterclass

The Sydney Business School’s MBA curriculum involves compulsory masterclasses on contemporary business topics. These are my notes from the lean/agile startup session – 20/4/2013 at the Innovation Campus.

Small businesses (fewer than 20 people) account for over one third of private industry output.

There’s a lot of hype surrounding lean start-ups, and the stereotypical start-up is a hip, innovative IT company, based in Silicon Valley, producing niche applications for smartphones or the web. However, it’s not cool, cutting-edge tech that makes or breaks a start-up. Instead, building a successful lean start-up is all about finding an unmet consumer or business need and providing a solution that fills that need in a way that people find irresistible.

There’s enormous cross-over between start-ups and marketing, which I’d never really seen prior to studying marketing. Previously I’d always thought of marketing as being just branding and promotion, but after starting my MBA studies I’ve realised that marketing needs to play a much more important role in business. Marketing is about understanding and fulfilling the needs of customers which, aside from making money, is the sole reason most commercial organisations exist. Just because you’ve got a great product or service to take to the market, you’re not going to be instantly successful. Your product needs to provide irresistible value to a large enough body of people in order to be successful.

There are lots of really good business lessons to be learned from the lean/agile movement. While slick, refined, well-produced products are very attractive things to take to market, the best way to build a great product is to focus on developing a Minimum Viable Product (MVP) and then iteratively improve it. Focus on the attributes of your product that provide a real, palpable benefit. If your potential customers can’t see a use for your product (or aren’t willing to pay for it), go back to the drawing board. Change your design, develop just enough to test your design, direction, theory and assumptions – then go back and test it with your customers. Keep repeating until you get it right.

Installing PHP & Oracle PDO Drivers on Ubuntu

Given that PHP and Oracle databases are fairly mainstream platforms these days, you might expect that there’d be a simple, straightforward way to install and connect PHP with Oracle on Ubuntu.

Nope.

If it’s not something you’re doing on a regular basis, or unless you’ve got a foolproof set of recent instructions, it can be really, really painful.

I faced the problem last week while configuring a new virtual server. After a few hours of bumbling around I finally found a decent guide for an earlier version of the Oracle Instant Client drivers. On the off-chance that it saves someone else some hassle, I thought I’d post a slightly updated version that works for PHP 5.3 and Instant Client 11.2. Chances are that I’ll need to refer to this guide in the future, too.

I started with a brand-spanking-new VM running Ubuntu 12.04. Your mileage may vary if attempting to install over the top of earlier versions, etc. You’ll need root privileges to run many of these commands, so you might want to start a root shell (sudo su -).

First, fetch the Instant Client and SDK packages from Oracle: http://www.oracle.com/technology/software/tech/oci/instantclient/index.html
For 64-bit Ubuntu, you’ll need to grab both “instantclient-basic-linux.x64-*.zip” and “instantclient-sdk-linux.x64-*.zip”.

Put both downloads somewhere convenient and extract:
mkdir -p /opt/oracle/instantclient
cp ~/instantclient-basic-*-*.zip /opt/oracle/instantclient/
cp ~/instantclient-sdk-*-*.zip /opt/oracle/instantclient/
cd /opt/oracle/instantclient
unzip ./instantclient-basic-*-*.zip
unzip ./instantclient-sdk-*-*.zip
mv instantclient*/* ./
rmdir instantclient*/

Next we need to create some symlinks so that the Oracle shared libraries appear where they need to:
ln -s libclntsh.so.* libclntsh.so
ln -s libocci.so.* libocci.so
echo /opt/oracle/instantclient >> /etc/ld.so.conf
ldconfig

The following isn’t strictly necessary, but if you need a TNS names configuration, you can put the config details into the sqlnet.ora file:
mkdir -p network/admin
touch network/admin/sqlnet.ora

Now install Apache, PHP etc.:
apt-get install --yes php5 php5-cli php5-dev php-db php-pear
apt-get install --yes build-essential libaio1

Install the pecl oci8 extension:
pecl install oci8
When prompted for the ORACLE_HOME path, enter “shared,instantclient,/opt/oracle/instantclient”

Add the configuration to your php.ini files:
echo "# configuration for php OCI8 module" > /etc/php5/cli/conf.d/oci8.ini
echo "extension=oci8.so" >> /etc/php5/cli/conf.d/oci8.ini

Now we install pdo_oci. It hasn’t been updated in a while, so a bit of fancy linking is in order…
cd /usr/include/
ln -s php5 php

cd /opt/oracle/instantclient

mkdir -p include/oracle/11.2/
cd include/oracle/11.2/
ln -s ../../../sdk/include client
cd -

mkdir -p lib/oracle/11.2/client
cd lib/oracle/11.2/client
ln -s ../../../../ lib
cd -

pecl channel-update pear.php.net
mkdir -p /tmp/pear/download/
cd /tmp/pear/download/
pecl download pdo_oci
tar xvf PDO_OCI*.tgz
cd PDO_OCI*

### copy the lines below into the file "config.m4.patch"
*** config.m4	2005-09-24 17:23:24.000000000 -0600
--- /home/myuser/Desktop/PDO_OCI-1.0/config.m4	2009-07-07 17:32:14.000000000 -0600
***************
*** 7,12 ****
--- 7,14 ----
    if test -s "$PDO_OCI_DIR/orainst/unix.rgs"; then
      PDO_OCI_VERSION=`grep '"ocommon"' $PDO_OCI_DIR/orainst/unix.rgs | sed 's/[ ][ ]*/:/g' | cut -d: -f 6 | cut -c 2-4`
      test -z "$PDO_OCI_VERSION" && PDO_OCI_VERSION=7.3
+   elif test -f $PDO_OCI_DIR/lib/libclntsh.$SHLIB_SUFFIX_NAME.11.2; then
+     PDO_OCI_VERSION=11.2
    elif test -f $PDO_OCI_DIR/lib/libclntsh.$SHLIB_SUFFIX_NAME.10.1; then
      PDO_OCI_VERSION=10.1
    elif test -f $PDO_OCI_DIR/lib/libclntsh.$SHLIB_SUFFIX_NAME.9.0; then
***************
*** 119,124 ****
--- 121,129 ----
      10.2)
        PHP_ADD_LIBRARY(clntsh, 1, PDO_OCI_SHARED_LIBADD)
        ;;
+     11.2)
+       PHP_ADD_LIBRARY(clntsh, 1, PDO_OCI_SHARED_LIBADD)
+       ;;
      *)
        AC_MSG_ERROR(Unsupported Oracle version! $PDO_OCI_VERSION)
        ;;
#EOF

patch --dry-run -i config.m4.patch && patch -i config.m4.patch
ORACLE_HOME=/opt/oracle/instantclient phpize

./configure --with-pdo-oci=instantclient,/opt/oracle/instantclient,11.2

make && make test && make install && mv modules/pdo_oci.so /usr/lib/php5/20090626/

Finally, add the configuration lines that enable the pdo_oci extension:
cat > /etc/php5/apache2/conf.d/pdo_oci.ini <<EOF
; configuration for php PDO_OCI module
extension=pdo_oci.so
EOF

Check that everything succeeded:
php --info | grep oci
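
If both extensions are listed, a short PDO connection test should confirm the driver works end-to-end. As before, the host, service name and credentials below are placeholders for your own environment:

<?php
// Quick PDO_OCI smoke test - connection details are placeholders.
$dsn = 'oci:dbname=//db.example.com:1521/XE;charset=AL32UTF8';
try {
    $pdo  = new PDO($dsn, 'scott', 'tiger');
    $stmt = $pdo->query("SELECT 'pdo_oci works' AS msg FROM dual");
    $row  = $stmt->fetch(PDO::FETCH_ASSOC);
    echo $row['MSG'] . "\n"; // Oracle upper-cases unquoted column aliases
} catch (PDOException $e) {
    echo "Connection failed: " . $e->getMessage() . "\n";
}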

GoogleRefine – An introductory tutorial

A ‘big issue’ that university research managers often grapple with is the variable, messy nature of bibliographic data. Authors don’t always use the same names, sometimes different people publish under the same name, research is published at different campuses, locations & countries, publishers & online indices don’t format data consistently, and, of course, typographic & human error can all impact the quality of research activity data.

Typically, cleaning this data is a labour-intensive, manual process – the data needs to be extracted in a format that is useful and manageable. Normally a spreadsheet application is used to handle the data after it’s been extracted. Because of the nuanced variations and conventions in a bibliographic dataset, cleaning it usually means inspection, comparison and cognitive processing – it’s the sort of work that would make a fantastic study in machine learning, if only there were enough time and budget to teach a computer how to make sense of the data! After cleaning, sorting, de-duplication & re-organising, the data typically needs to be converted into a useful format, then fed back into research information systems, databases and reports.

The traditional approach to managing this data has been to capture and maintain it by hand. However, with increasing frequency it is being automated using scholarly index services like Scopus, Web of Science, Google Scholar & others, and this opens up some big challenges (and exciting possibilities) for keeping research metadata clean, accurate & reliable.

GoogleRefine is a tool that makes managing this type of data really, really easy. In its basic form the tool allows users to load data from various formats and sources, perform transformations and merge the results into something useful. Its advanced features include transforming & exporting data into different formats, looking up and validating values using web services, and scripting complex parsing rules. The most powerful aspect of GoogleRefine is that every change is ‘undo-able’ (if you make a mistake) and repeatable (so that data can be transformed through a series of iterations). GoogleRefine has a small footprint and in most cases can run without administrative rights, which is a big plus – and it’s Open Source and completely free!

So, enough back-story – let’s see GoogleRefine in action! I’m going to skip the basics of installation and jump right into a real-world example, but if you need help getting set up, there is plenty of easy-to-follow documentation available (http://code.google.com/p/google-refine/wiki/GettingStarted). Here’s what the tool should look like once it’s up and running.

My demonstration is going to be based on cleaning up a list of authors’ names – the data is fictional, but the problem is representative of the typical jobs for which I use Refine. You’ll notice that the following list of names contains variations in title, typographic errors and different ordering (I’ve sketched a hand-rolled clean-up attempt after the list, to show why a dedicated tool helps):

Mr Bryan Albright
Dr Regina Troupe
Ms Natasha Lewis
Prof. Amy Seal
Dr Leblanc, Keith
Dr Keith Leblanc
Dr Melinda Chester
Natasha Lewis
Mr Brandon Runnels
Mr Michael Redd
Prof Suzanne Rawlins
Dr Brandon Runnels
Prof Susanne Rawlins
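
Here’s my own rough attempt at scripting the clean-up (this is just my sketch, not part of Refine). It strips honorifics and flips “Surname, Given” ordering, but it still can’t catch the Suzanne/Susanne typo:

<?php
// Hand-rolled clean-up of the sample list above (illustrative only).
$names = array('Mr Bryan Albright', 'Dr Leblanc, Keith', 'Dr Keith Leblanc',
               'Prof Suzanne Rawlins', 'Prof Susanne Rawlins');

function normalise($name) {
    // Drop honorifics, with or without a trailing full stop.
    $name = preg_replace('/^(Mr|Ms|Mrs|Dr|Prof)\.?\s+/i', '', $name);
    // Flip "Surname, Given" into "Given Surname".
    if (strpos($name, ',') !== false) {
        list($last, $first) = array_map('trim', explode(',', $name, 2));
        $name = $first . ' ' . $last;
    }
    return $name;
}

print_r(array_unique(array_map('normalise', $names)));
// Both spellings of Rawlins survive - typos need fuzzy matching, not rules.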

First off, we need to load the data into Refine…

We just need to select ‘CSV’ as the input format, customise the character encoding (if needed), and assign a name to our project (I’m using “Names”), then click “Create Project”. Now we’re on to the exciting stuff…

Clicking on the name of any column in our dataset opens the Facet menu, where various grouping, sorting and transformation functions can be accessed. Let’s create a basic text facet by clicking Author -> Facet -> Text facet.

Here the clustering function has picked up two values for Keith Leblanc with different formatting. Select Merge and Close, and we’re returned to our original project view, only now we can see that the two original values representing Keith Leblanc have been merged into one. When it comes to clustering functions, there’s no shortage of options – for example (I’ve illustrated two of these with a short snippet after the list):

Nearest Neighbor – Levenshtein Distance

Nearest Neighbor – PPM

Metaphone analysis (comparison based on how strings “sound” when spoken)
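
To give a feel for what these clustering metrics measure, PHP conveniently ships two of them as built-ins, levenshtein() and metaphone(). This just illustrates the metrics themselves, not Refine’s internal implementation:

<?php
// Levenshtein distance counts the single-character edits between two strings.
echo levenshtein('Suzanne Rawlins', 'Susanne Rawlins') . "\n"; // 1 - one edit apart
// Metaphone reduces a string to a phonetic key; matching keys "sound" the same.
var_dump(metaphone('Suzanne') === metaphone('Susanne'));       // expected: bool(true)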

When we’re satisfied with our matches, we simply merge & then export the data to whatever format we need. We can even define our own data format templates, but that’s a topic for another day.

As we’ve seen, the tool is really flexible, but this tutorial has only scratched the surface of Refine’s features. It’s flexible and scriptable, and its true power lies in the ability to quickly and conveniently look up and validate data against external APIs & services. Stay tuned for another post in the coming weeks on how we can combine Refine with the Elsevier Scopus API to extract a count of indexed publication records for our authors.

References:
GoogleRefine – http://code.google.com/p/google-refine/
GoogleRefine 2.0 Introduction – http://www.youtube.com/watch?v=B70J_H_zAWM
Getting Started Guide – http://code.google.com/p/google-refine/wiki/GettingStarted