
Receiving Random Cost Output On Tensorflow Regression- Python

I am relatively new to tensorflow and I have attempted to adapt some code from a tutorial to process my own data. The data can be found here: https://github.com/z12332/tensorflow-

Solution 1:

Here is a working version of the code, but first I'll offer some notes.

1) Do some reading on the TensorFlow MNIST tutorial. In particular, see why your placeholder sizes are not correct and why we are going to use a one-hot-encoded version of the labels for this task.

2) Consider using a cross-entropy cost. It is better suited to this multiclass task (see the sketch after these notes).

3) Try not to be too discouraged by the performance of this basic model (it does not perform well). Consider exploring the data to look for important features, and also look around for what the state-of-the-art performance on this dataset might be.
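For note 2, here is a minimal sketch (not part of the original code) of what a cross-entropy cost could look like, assuming the softmax output y and the one-hot placeholder y_ defined in the code below; the small constant inside the log is only there to avoid log(0):

# Cross-entropy: average over examples of -sum(true_label * log(predicted_prob)).
cross_entropy = tf.reduce_mean(
    -tf.reduce_sum(y_ * tf.log(y + 1e-10), reduction_indices=[1]))
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy)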

import tensorflow as tf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.read_csv('/Users/benny/desktop/export.csv')
data_ = df.iloc[1:,9:27]
data_['CRISPR'] = df.iloc[:,30]
data_ = data_.drop(['Diseases'],axis=1)
# we will need the number of classes to predict
# the nunique method gets us the number of unique labels
nclasses = data_["CRISPR"].nunique()
# let's collect the labels here. We'll one-hot-encode them
# using pandas.get_dummies()
inputY = pd.get_dummies(data_.iloc[:, -1])
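# (Sanity check, not in the original answer: inputY should have one row per
#  example and one column per class, i.e. inputY.shape == (n_examples, nclasses).)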

dim = 16
learning_rate = 0.0000001
display_step = 50

X = tf.placeholder(tf.float32, [None, dim])
# Y should define the shape of your labels.
# As discussed, we're going to need one-hot-encoded labels for
# this prediction task. This line does not define the shape of your input;
# we'll define it later.
# Y = tf.placeholder(tf.float32)


# fill missing values before converting to a matrix
# (the numpy array returned by as_matrix() has no fillna method)
train_X = data_.iloc[:200, :-2].fillna(value=0).as_matrix()
train_Y = inputY[:200].as_matrix()

# fill missing values in the test split the same way as the training split
test_X = data_.iloc[200:320, :-2].fillna(value=0).as_matrix()
test_Y = inputY[200:320].as_matrix()

n_samples = train_Y.shape[0]  # number of training examples (train_Y is 2-D once one-hot-encoded)

# It's important we get the shape of the weight and bias matrices
# correct. The version in your code is:
# W = tf.Variable(tf.zeros([dim]), name="weight")
# That won't work, since we want to be able to multiply [X, W]
# to produce an evidence vector for each example.
# The shape of X is [200 x dim] - there should be a weight for each
# feature, and there are nclasses classes, so W is [dim, nclasses].
W = tf.Variable(tf.zeros([dim, nclasses]))

# For the bias, there should be one for each class.
# b = tf.Variable(tf.zeros([1]), name="bias")
b = tf.Variable(tf.zeros([nclasses]))

# The correct operation here is tf.matmul. I suspect you introduced
# tf.mul to make your earlier matrix definition work in the graph.
# activation = tf.add(tf.mul(X, W), b)
activation = tf.add(tf.matmul(X, W), b) 
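# Shapes: X is [None, dim], W is [dim, nclasses], b is [nclasses], so
# activation is [None, nclasses] - one evidence score per class per example.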

# You forgot the actual model! Assuming you want to do
# softmax classification, let's do:
y = tf.nn.softmax(activation)
# Now let's define our input labels (we could have called them Y)
# as you had them. Notice what we are saying here: expect a matrix
# of floats with any number of examples and nclasses columns,
# which is exactly the size of train_Y.
y_ = tf.placeholder(tf.float32, [None, nclasses])


# We define the cost to reflect the fact that our model output is called
# y, not activation (anymore).
# cost = tf.reduce_sum(tf.pow(activation-Y, 2))/(2*n_samples)
cost = tf.reduce_sum(tf.pow(y_ - y, 2))/(2*n_samples)
optimizer =  tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

hm_epochs = 1000

init = tf.initialize_all_variables()
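# (Note, not in the original answer: later TF 1.x releases deprecate
#  tf.initialize_all_variables() in favor of tf.global_variables_initializer().)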
sess = tf.Session()
sess.run(init)

# I imagine what you want to do here is stochastic gradient descent.
# I am not sure this is the way to do it. To check the code, I
# will train over the entire training data for 1000 repetitions,
# similar to the tutorial code.
for i in range(hm_epochs):
    sess.run(optimizer, feed_dict={X: train_X,
                               y_: train_Y})

    if (i) % display_step == 0:
        cc = sess.run(cost, feed_dict={X: train_X,y_: train_Y})
        print "Training step:", '%04d' % (i), "cost=", "{:.9f}".format(cc)


# To check the accuracy (this is one way of measuring the performance
# of an algorithm on a classification task) we will do the following,
# adapted from the TensorFlow MNIST example code:

correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
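# correct_prediction is a vector of booleans, one per example; casting to
# float and taking the mean gives the fraction of examples classified correctly.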
print(sess.run(accuracy, feed_dict={X: test_X,
                              y_: test_Y}))
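
As a quick follow-up (not part of the original answer), you can also pull out the predicted class index for each test example by taking the argmax of the softmax output, reusing the X placeholder and test_X from above:

predictions = sess.run(tf.argmax(y, 1), feed_dict={X: test_X})
print(predictions[:10])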
