ValueError                                Traceback (most recent call last)
Cell In[197], line 11
      9 # Standardize the data
     10 scaler = StandardScaler()
---> 11 X_train_scaled = scaler.fit_transform(X_train)
     12 X_test_scaled = scaler.transform(X_test)
     14 # Get the number of classes

File /home/share/temp/venv/lib/python3.8/site-packages/sklearn/utils/_set_output.py:157, in _wrap_method_output.<locals>.wrapped(self, X, *args, **kwargs)
    155 @wraps(f)
    156 def wrapped(self, X, *args, **kwargs):
--> 157     data_to_wrap = f(self, X, *args, **kwargs)
    158     if isinstance(data_to_wrap, tuple):
    159         # only wrap the first output for cross decomposition
    160         return_tuple = (
    161             _wrap_data_with_container(method, data_to_wrap[0], X, self),
    162             *data_to_wrap[1:],
    163         )

File /home/share/temp/venv/lib/python3.8/site-packages/sklearn/base.py:916, in TransformerMixin.fit_transform(self, X, y, **fit_params)
    912 # non-optimized default implementation; override when a better
    913 # method is possible for a given clustering algorithm
    914 if y is None:
    915     # fit method of arity 1 (unsupervised transformation)
…
    909 )
    911 if self.with_std:
    912     # First pass

ValueError: Cannot center sparse matrices: pass `with_mean=False` instead. See docstring for motivation and alternatives.

June 8, 2024

The error is caused by the type of the input data. According to the error message, StandardScaler cannot center a sparse matrix, so with_mean=False has to be specified.

A sparse matrix is a matrix in which most of the elements are zero. Such matrices are often used to represent large, high-dimensional data, for example text data.
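To make the idea concrete, here is a minimal sketch of a sparse matrix built with SciPy. The values are made up for illustration and are not the data from the original notebook; the CSR format is chosen only because it is a common one.

```python
import numpy as np
from scipy import sparse

# A small dense array in which most entries are zero (illustrative values only).
dense = np.array([
    [0.0, 0.0, 3.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 2.0],
])

# CSR format stores only the nonzero entries and their positions.
X_sparse = sparse.csr_matrix(dense)

n_stored = X_sparse.nnz                          # number of stored nonzero values
n_total = X_sparse.shape[0] * X_sparse.shape[1]  # total number of entries
density = n_stored / n_total                     # fraction of entries that are nonzero

print("is sparse:", sparse.issparse(X_sparse))
print("stored:", n_stored, "of", n_total, "density:", density)
```

Calling sparse.issparse on X_train in the same way is a quick check of whether your own data is the kind that triggers this error.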

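A sparse matrix can be passed to StandardScaler directly, but centering cannot be applied to it: subtracting each column's mean would turn most of the zero entries into nonzero values and make the matrix dense. When using StandardScaler on sparse input, you therefore need to specify with_mean=False to disable centering, so that the data is only scaled to unit variance.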
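The behavior described above can be reproduced on a toy matrix. This is a small sketch with made-up values, not the original data: the default scaler raises the same ValueError, while with_mean=False scales each column and keeps the result sparse.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import StandardScaler

# Toy sparse input (illustrative values only).
X = sparse.csr_matrix(np.array([
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 0.0],
    [4.0, 0.0, 0.0],
    [0.0, 6.0, 0.0],
]))

# The default StandardScaler (with_mean=True) refuses sparse input.
raised = False
message = ""
try:
    StandardScaler().fit_transform(X)
except ValueError as err:
    raised = True
    message = str(err)

# With centering disabled, each column is only divided by its standard deviation.
scaler = StandardScaler(with_mean=False)
X_scaled = scaler.fit_transform(X)

print("default scaler raised:", raised)
print("result is still sparse:", sparse.issparse(X_scaled))
print("stored nonzeros before/after:", X.nnz, X_scaled.nnz)
```

Because zeros stay zeros when a column is only divided by a constant, the number of stored values does not change, which is why this variant is safe for large sparse data.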

The corrected code is below:

python
import numpy as np  # needed for np.unique below
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from keras.models import Sequential
from keras.layers import Dense

# X_combined (the sparse feature matrix) and y (the labels) are assumed to be defined earlier.

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X_combined, y, test_size=0.2, random_state=42)

# Standardize the data
scaler = StandardScaler(with_mean=False)  # with_mean=False added: scale only, no centering
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Get the number of classes
num_classes = len(np.unique(y_train))

# Build the model
model = Sequential([
    Dense(64, activation='relu', input_shape=(X_train_scaled.shape[1],)),
    Dense(64, activation='relu'),
    Dense(num_classes, activation='softmax')  # output layer for multi-class classification
])

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Train the model
history = model.fit(X_train_scaled, y_train, epochs=10, batch_size=32, validation_data=(X_test_scaled, y_test))

# Predict and evaluate on the test data
loss, accuracy = model.evaluate(X_test_scaled, y_test)
print("Test Loss:", loss)
print("Test Accuracy:", accuracy)

With this change, standardizing the sparse-matrix data should run without the error.
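If centering is actually wanted and the data is small enough to fit in memory as a dense array, one alternative is to convert it to dense first and use the default StandardScaler. The sketch below assumes that situation; to_dense is a hypothetical helper introduced here for illustration, and the matrices are made-up stand-ins for X_train and X_test.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import StandardScaler


def to_dense(X):
    """Hypothetical helper: return a dense ndarray, converting only if X is sparse."""
    return X.toarray() if sparse.issparse(X) else np.asarray(X)


# Made-up stand-ins for the real training and test matrices.
X_train = sparse.csr_matrix(np.array([
    [0.0, 2.0],
    [4.0, 0.0],
    [0.0, 6.0],
    [8.0, 0.0],
]))
X_test = sparse.csr_matrix(np.array([[2.0, 0.0]]))

# The default scaler centers and scales; this works because the input is now dense.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(to_dense(X_train))
X_test_scaled = scaler.transform(to_dense(X_test))

print("column means:", X_train_scaled.mean(axis=0))
print("column stds:", X_train_scaled.std(axis=0))
```

For large, high-dimensional data such as text features, the dense copy may not fit in memory, in which case with_mean=False remains the practical choice.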


Posted by ぼっち