DL with DNN: Regression on sklearn's built-in california_housing dataset using a neural network trained with gradient descent (GD); mini-batch training is faster when the dataset is large
Regression on sklearn's built-in california_housing dataset using a neural network trained with gradient descent (GD); mini-batch training is faster when the dataset is large
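The dataset contains 20,640 observations of 9 variables: the average house value as the target variable, and the following input variables (features): average income, average house age, average rooms, average bedrooms, population, average occupancy, latitude, and longitude.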
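As a quick reference, the eight features listed above correspond, in order, to the entries of `housing.feature_names` in scikit-learn. The sketch below hard-codes those names instead of downloading the data; the short descriptions are paraphrases of the list above, not text taken from the library (scikit-learn's own documentation describes `MedInc` as the median income of a block group).

```python
# Feature names as returned by fetch_california_housing().feature_names,
# paired with the descriptions used in the text (descriptions are paraphrases).
FEATURES = [
    ("MedInc", "income"),
    ("HouseAge", "house age"),
    ("AveRooms", "average rooms"),
    ("AveBedrms", "average bedrooms"),
    ("Population", "population"),
    ("AveOccup", "average occupancy"),
    ("Latitude", "latitude"),
    ("Longitude", "longitude"),
]

N_FEATURES = len(FEATURES)
N_VARIABLES = N_FEATURES + 1  # the 8 features plus the house-value target

for name, description in FEATURES:
    print(f"{name:<11}{description}")
print("variables in total:", N_VARIABLES)
```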
epoch: 20 batch_id: 83 Batch loss 0.5640518069267273
……
epoch: 90 batch_id: 203 Batch loss 0.6403363943099976
epoch: 90 batch_id: 204 Batch loss 0.45315566658973694
epoch: 90 batch_id: 205 Batch loss 0.5528439879417419
epoch: 90 batch_id: 206 Batch loss 0.386596143245697
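The `batch_id` values in this log stop at 206 because each epoch is split into a fixed number of mini-batches. A minimal sketch of that arithmetic, assuming the 20640 samples and the batch size of 100 used in the code in this post:

```python
import math

m = 20640         # number of observations in the dataset
batch_size = 100  # mini-batch size used in the training code

n_batches = math.ceil(m / batch_size)  # mini-batches per epoch
last_batch_id = n_batches - 1          # batch ids start at 0

print(n_batches, last_batch_id)  # 207 206
```

The loss is only logged on every 10th epoch, so with 100 epochs (numbered 0 to 99) the last logged epoch is 90, which matches the final lines of the output above.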
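# Written for the TensorFlow 1.x API (tf.placeholder, tf.Session).
# Under TensorFlow 2.x, use `import tensorflow.compat.v1 as tf` and call
# `tf.disable_v2_behavior()` instead of the plain import below.
import tensorflow as tf
import numpy as np
from sklearn.datasets import fetch_california_housing
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()  # standardizes each feature to zero mean and unit variance

# Load the housing data
housing = fetch_california_housing()
m, n = housing.data.shape  # m samples, n features
print(housing.keys())         # keys of the dataset object
print(housing.feature_names)  # names of the features
print(housing.target)
print(housing.DESCR)

# Unscaled data with a bias column of ones (not used below)
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]
# Standardized data with a bias column of ones; this is what the model trains on
scaled_data = scaler.fit_transform(housing.data)
data = np.c_[np.ones((m, 1)), scaled_data]
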
# T1. Conventional approach: a placeholder whose first dimension is None
# accepts any number of rows
A = tf.placeholder(tf.float32, shape=(None, 3))
B = A + 5
with tf.Session() as sess:
    test_b_1 = B.eval(feed_dict={A: [[1, 2, 3]]})
    test_b_2 = B.eval(feed_dict={A: [[4, 5, 6], [7, 8, 9]]})
print(test_b_1)
print(test_b_2)

# T2. Mini-batch approach
X = tf.placeholder(tf.float32, shape=(None, n + 1), name="X")
y = tf.placeholder(tf.float32, shape=(None, 1), name="y")

# Set the hyperparameters and let the optimizer compute the gradients
n_epochs = 100
learning_rate = 0.01
batch_size = 100
n_batches = int(np.ceil(m / batch_size))  # mini-batches per epoch

theta = tf.Variable(tf.random_uniform([n + 1, 1], -1.0, 1.0, seed=42), name="theta")
y_pred = tf.matmul(X, theta, name="predictions")  # linear model: X @ theta
error = y_pred - y
mse = tf.reduce_mean(tf.square(error), name="mse")
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)
training_op = optimizer.minimize(mse)
init = tf.global_variables_initializer()

# Define how a mini-batch is fetched
def fetch_batch(epoch, batch_index, batch_size):
    # Seed by (epoch, batch) so that every run draws the same batches
    np.random.seed(epoch * n_batches + batch_index)
    indices = np.random.randint(m, size=batch_size)  # sampled with replacement
    X_batch = data[indices]
    y_batch = housing.target.reshape(-1, 1)[indices]
    return X_batch, y_batch

# Mini-batch training loop
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(n_epochs):
        avg_cost = 0.
        for batch_index in range(n_batches):
            X_batch, y_batch = fetch_batch(epoch, batch_index, batch_size)
            sess.run(training_op, feed_dict={X: X_batch, y: y_batch})

            # On every 10th epoch, log the MSE of each mini-batch
            if epoch % 10 == 0:
                total_loss = 0
                acc_train = mse.eval(feed_dict={X: X_batch, y: y_batch})  # batch MSE
                total_loss += acc_train
                print(acc_train, total_loss)
                print("epoch:", epoch, "batch_id:", batch_index, "Batch loss", total_loss)

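For readers without a TensorFlow 1.x environment, the same training loop can be sketched in plain NumPy. This is not the code from the post: it replaces `GradientDescentOptimizer` with a hand-written update based on the analytic gradient of the MSE, and it trains on a small synthetic dataset so that it runs without downloading anything. The helper names `make_data`, `train_minibatch_gd`, and `mean_squared_error` are hypothetical, as are the synthetic data sizes.

```python
import numpy as np


def make_data(m=2000, n=8, seed=0):
    """Synthetic stand-in for the housing data (an assumption, not the real dataset)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(loc=5.0, scale=3.0, size=(m, n))
    true_theta = rng.uniform(-1.0, 1.0, size=(n + 1, 1))
    # Standardize the features, then prepend a bias column of ones, as in the post.
    X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)
    data = np.c_[np.ones((m, 1)), X_scaled]
    y = data @ true_theta + 0.1 * rng.normal(size=(m, 1))
    return data, y


def train_minibatch_gd(data, y, n_epochs=100, learning_rate=0.01,
                       batch_size=100, seed=42):
    """Mini-batch gradient descent for a linear model with an MSE loss."""
    m, n_plus_1 = data.shape
    n_batches = int(np.ceil(m / batch_size))
    rng = np.random.default_rng(seed)
    theta = rng.uniform(-1.0, 1.0, size=(n_plus_1, 1))
    for epoch in range(n_epochs):
        for batch_index in range(n_batches):
            # Sample indices with replacement, like np.random.randint in fetch_batch.
            indices = rng.integers(0, m, size=batch_size)
            X_batch, y_batch = data[indices], y[indices]
            error = X_batch @ theta - y_batch
            # Gradient of mean(error ** 2) with respect to theta.
            gradient = (2.0 / batch_size) * (X_batch.T @ error)
            theta -= learning_rate * gradient
    return theta


def mean_squared_error(data, y, theta):
    return float(np.mean((data @ theta - y) ** 2))


data, y = make_data()
theta = train_minibatch_gd(data, y)
final_mse = mean_squared_error(data, y, theta)
print("final MSE on the synthetic data:", final_mse)
```

The hyperparameters (100 epochs, learning rate 0.01, batch size 100) mirror the ones in the post; only the data and the gradient computation differ.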