Random Forest - Only one branch generated in Vivado HLS #71

Open
RolandJohnson76 opened this issue Aug 1, 2024 · 1 comment

Comments

@RolandJohnson76

Hello Community,

I've been experimenting with random forests in conifer. Although I can see the tree being created and its details in parameters.h, my decision_function.vhd only includes one branch of the tree.

I'm using a custom dataset that I extracted from a program running in an FPGA. The random forest has 1 tree with 3 layers. In reality, I would like around 100 trees with a depth of around 5, but I'm using a single small tree here just to debug this issue.

Code is below:

Imports

import os
import joblib
import pandas as pd
import conifer
from sklearn import preprocessing
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

Load dataset and assign names

col_names = ['Row 0', 'Row 1', 'Row 2', 'Row 3', 'Row 4', 'Row 5', 'Row 6', 'Row 7', 'Row 8', 'Row 9', 'Row_Delta', 'Row_Mean', 'Row_Min', 'Row_Max', 'Row_Width', 'Col 0', 'Col 1', 'Col 2', 'Col 3', 'Col 4', 'Col 5', 'Col 6', 'Col 7', 'Col 8', 'Col 9', 'Col_Delta', 'Col_Mean', 'Col_Min', 'Col_Max', 'Col_Width', 'Label']

rs_data = pd.read_csv("/path/to/dataset/ML_Dataset_1_256_29072024.csv", header=None, names=col_names)

Assign features and target

feature_cols = ['Row 0', 'Row 1', 'Row 2', 'Row 3', 'Row 4', 'Row 5', 'Row 6', 'Row 7', 'Row 8', 'Row 9',
                'Row_Delta', 'Row_Mean', 'Row_Min', 'Row_Max', 'Row_Width',
                'Col 0', 'Col 1', 'Col 2', 'Col 3', 'Col 4', 'Col 5', 'Col 6', 'Col 7', 'Col 8', 'Col 9',
                'Col_Delta', 'Col_Mean', 'Col_Min', 'Col_Max', 'Col_Width']

X = rs_data[feature_cols] # Features
y = rs_data.Label # Target variable

Split data

X_train_val, X_test, y_train_val, y_test = train_test_split(X, y, test_size=0.3, random_state=1) # 70% training and 30% test

Scale data

scaler = preprocessing.StandardScaler().fit(X_train_val)
X_train_val = scaler.transform(X_train_val)
X_test = scaler.transform(X_test)

Train with RandomForestClassifier

train = True
if train:
    clf = RandomForestClassifier(n_estimators=1, max_depth=3, random_state=0)
    clf.fit(X_train_val, y_train_val)
    if not os.path.exists('ram_sniffer_rf'):
        os.makedirs('ram_sniffer_rf')
    joblib.dump(clf, 'ram_sniffer_rf/bdt.joblib')
else:
    clf = joblib.load('ram_sniffer_rf/bdt.joblib')

Create and compile the model

cnf = conifer.converters.convert_from_sklearn(clf, cfg)
cnf.compile()

Run HLS C Simulation and get the output

y_hls = cnf.decision_function(X_test)
y_skl = clf.predict_proba(X_test)
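
One way to compare the two outputs is to look at the largest element-wise difference between them. The following is a minimal sketch in plain Python, assuming both outputs have been reduced to flat score lists of the same length. The helper `max_abs_diff`, the example values, and the tolerance are all hypothetical and not taken from the issue.

```python
def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two equal-length score lists."""
    if len(a) != len(b):
        raise ValueError("score lists must have the same length")
    return max(abs(x - y) for x, y in zip(a, b))


# Made-up scores for illustration only; they are not results from the issue.
y_hls_example = [0.25, 0.75, 0.5]
y_skl_example = [0.25, 0.7421875, 0.5]

worst = max_abs_diff(y_hls_example, y_skl_example)

# Hypothetical tolerance: one step of a format with 6 fractional bits.
agree = worst <= 2 ** -6
print(worst, agree)
```

A small difference would be compatible with fixed-point rounding, while a large one would suggest that the two models are not evaluating the same tree.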

Before running the final line, I go into "build_hls.tcl" and remove the references to "-flow_target", as reported in issue #67.
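
The manual edit described above could also be scripted. This is a minimal sketch that strips every `-flow_target <value>` option from Tcl text. The sample Tcl lines are an assumption made for illustration; the actual contents of the generated build_hls.tcl are not shown in the issue and may differ.

```python
import re

# Hypothetical sample lines; the real build_hls.tcl may look different.
SAMPLE_TCL = (
    "open_solution -reset solution1 -flow_target vivado\n"
    "set_part {xcvu9p-flgb2104-2L-e}\n"
)

# Matches the option together with its single argument and the leading whitespace.
_FLOW_TARGET = re.compile(r"[ \t]+-flow_target[ \t]+\S+")


def strip_flow_target(tcl_text: str) -> str:
    """Remove every '-flow_target <value>' option, leaving the rest of each line intact."""
    return _FLOW_TARGET.sub("", tcl_text)


cleaned = strip_flow_target(SAMPLE_TCL)
print(cleaned)
```

To apply it to a real file, the text could be read and written back with `pathlib.Path.read_text()` and `write_text()`, keeping a backup of the original script first.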

Synthesize the model

cnf.build(csim=False, cosim=False, export=True)

The build works and I get a message below saying "True"

In my parameters.h file, I have 7 weights, which is what I would expect to see for one tree with 3 layers. However, in my decision_function.vhd file in my_prj/solution1/syn/vhdl I only have 3 weights, which makes it seem like only one branch is being implemented.
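
As a sanity check on the numbers above, the sketch below counts the nodes of a full binary tree and enumerates its root-to-leaf branches. It assumes scikit-learn's array convention, in which a child index of -1 marks a leaf (as in `tree_.children_left`). The 7-node example tree and both helper functions are hypothetical. The issue does not say which of these counts the 7 and 3 in the generated files correspond to.

```python
def full_tree_counts(depth):
    """Return (split nodes, leaves, total nodes) for a full binary tree with `depth` levels of splits."""
    splits = 2 ** depth - 1
    leaves = 2 ** depth
    return splits, leaves, splits + leaves


def root_to_leaf_paths(children_left, children_right, node=0, path=()):
    """Yield every root-to-leaf path as a tuple of node indices.

    Assumes a child index of -1 marks a leaf (scikit-learn's convention).
    """
    path = path + (node,)
    if children_left[node] == -1:
        yield path
        return
    yield from root_to_leaf_paths(children_left, children_right, children_left[node], path)
    yield from root_to_leaf_paths(children_left, children_right, children_right[node], path)


# Hypothetical 7-node tree (3 splits, 4 leaves), stored in pre-order.
left = [1, 2, -1, -1, 5, -1, -1]
right = [4, 3, -1, -1, 6, -1, -1]

paths = list(root_to_leaf_paths(left, right))
print(full_tree_counts(2))  # (3, 4, 7)
print(full_tree_counts(3))  # (7, 8, 15)
print(paths)                # four branches, one per leaf
```

Running the same walk on the arrays of a trained tree (for example `clf.estimators_[0].tree_.children_left` and `children_right`) would list every branch that the generated VHDL is expected to cover.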

Have you seen something like this before?
What am I doing wrong?

Thanks in advance!

@thundertwonk001

Good morning, I found an alternative method for implementing forest algorithms in FPGAs, so please close this thread. I would be happy to share the method for those who are interested!
