ASUS TUF Gaming F17 - MCA Semester 2 AIML Practical Manual

Project Overview

This VS Code extension presents the MCA Semester 2 Artificial Intelligence and Machine Learning (AIML) practical assignments prescribed under NEP 2020 at Mumbai University.

Objective

The objective of this extension is to present AIML practical programs in a clean, professional, and portfolio-ready format.

AIML Practicals Covered

No.	Practical / Assignment
1	Water Jug Problem using DFS in Prolog
2	Tic-Tac-Toe Game in Prolog
3	Python Fundamentals: Data Types, if-elif, Functions
4	NumPy Practicals
5	Pandas Practicals
6	Data Visualization using Matplotlib and Pandas
7	8 Puzzle using Hill Climbing
8	Perceptron OR Operation
9	Perceptron using Stochastic Gradient Descent
10	ADALINE AND Operation
11	Dimensionality Reduction
12	Titanic Logistic Regression
13	SVM and Kernels
14	K-Means Clustering
15	Boosting Algorithms

Technologies Used

Prolog
Python
NumPy
Pandas
Matplotlib
Seaborn
Scikit-learn
VS Code
Node.js

Software Requirements

Visual Studio Code
Node.js
Python
SWI-Prolog
Jupyter Notebook
NumPy
Pandas
Matplotlib
Seaborn
Scikit-learn

Practical 1: Water Jug Problem using DFS in Prolog

Question

Solve the Water Jug problem using Depth First Search (DFS).

Aim

To obtain exactly 4 liters of water in the 5-liter jug using 5-liter and 3-liter jugs with Depth First Search.

Program

% Practical 1 - Water Jug Problem using DFS
% Jug capacities: 5L and 3L
% Goal: Get exactly 4L in 5L jug

goal((4, _)).

move((X, Y), (5, Y)) :- X < 5.
move((X, Y), (X, 3)) :- Y < 3.
move((X, Y), (0, Y)) :- X > 0.
move((X, Y), (X, 0)) :- Y > 0.

move((X, Y), (X1, Y1)) :-
    X > 0,
    Y < 3,
    T is min(X, 3 - Y),
    X1 is X - T,
    Y1 is Y + T.

move((X, Y), (X1, Y1)) :-
    Y > 0,
    X < 5,
    T is min(Y, 5 - X),
    X1 is X + T,
    Y1 is Y - T.

dfs(State, Path, Path) :-
    goal(State).

dfs(State, Visited, Path) :-
    move(State, NextState),
    \+ member(NextState, Visited),
    dfs(NextState, [NextState | Visited], Path).

solve :-
    dfs((0, 0), [(0, 0)], Path),
    reverse(Path, Solution),
    write('Solution Path:'), nl,
    print_path(Solution).

print_path([]).
print_path([H | T]) :-
    write(H), nl,
    print_path(T).

Run

?- consult('water_jug.pl').
?- solve.
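
For readers more comfortable with Python, the same search can be sketched as follows. This is a hypothetical mirror of the Prolog program, not part of the original practical: the names CAP_A, CAP_B, TARGET, moves, and dfs are illustrative. It generates successors in the same order as the move/2 clauses and uses the current path as the visited list, just as the Prolog version does.

```python
# Hypothetical Python mirror of the Prolog water jug program (illustrative names).
CAP_A, CAP_B, TARGET = 5, 3, 4

def moves(state):
    """Yield successor states in the same order as the Prolog move/2 clauses."""
    x, y = state
    if x < CAP_A:
        yield (CAP_A, y)            # fill the 5-litre jug
    if y < CAP_B:
        yield (x, CAP_B)            # fill the 3-litre jug
    if x > 0:
        yield (0, y)                # empty the 5-litre jug
    if y > 0:
        yield (x, 0)                # empty the 3-litre jug
    if x > 0 and y < CAP_B:
        t = min(x, CAP_B - y)       # pour 5-litre jug into 3-litre jug
        yield (x - t, y + t)
    if y > 0 and x < CAP_A:
        t = min(y, CAP_A - x)       # pour 3-litre jug into 5-litre jug
        yield (x + t, y - t)

def dfs(state, path):
    """Depth-first search; `path` also serves as the visited list."""
    if state[0] == TARGET:
        return path
    for nxt in moves(state):
        if nxt not in path:
            found = dfs(nxt, path + [nxt])
            if found:
                return found
    return None

solution = dfs((0, 0), [(0, 0)])
print("Solution Path:")
for step in solution:
    print(step)
```

Because DFS commits to the first untried move at every step, the path it returns is valid but not necessarily the shortest one.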

Practical 2: Tic-Tac-Toe Game in Prolog

Question

Write a Prolog program to implement the Tic-Tac-Toe game.

Aim

To implement a two-player Tic-Tac-Toe game using Prolog.

Program

% Practical 2 - Tic Tac Toe Game in Prolog

play :-
    Board = [1,2,3,4,5,6,7,8,9],
    game(Board, x).

display([A,B,C,D,E,F,G,H,I]) :-
    nl,
    write(A), write(' | '), write(B), write(' | '), write(C), nl,
    write('--+---+--'), nl,
    write(D), write(' | '), write(E), write(' | '), write(F), nl,
    write('--+---+--'), nl,
    write(G), write(' | '), write(H), write(' | '), write(I), nl.

win(P, [P,P,P,_,_,_,_,_,_]).
win(P, [_,_,_,P,P,P,_,_,_]).
win(P, [_,_,_,_,_,_,P,P,P]).
win(P, [P,_,_,P,_,_,P,_,_]).
win(P, [_,P,_,_,P,_,_,P,_]).
win(P, [_,_,P,_,_,P,_,_,P]).
win(P, [P,_,_,_,P,_,_,_,P]).
win(P, [_,_,P,_,P,_,P,_,_]).

draw(Board) :-
    \+ member(1, Board), \+ member(2, Board), \+ member(3, Board),
    \+ member(4, Board), \+ member(5, Board), \+ member(6, Board),
    \+ member(7, Board), \+ member(8, Board), \+ member(9, Board).

move([Pos|T], Pos, P, [P|T]) :- number(Pos).
move([H|T], Pos, P, [H|R]) :- move(T, Pos, P, R).

game(Board, Player) :-
    display(Board),
    write('Player '), write(Player), write(' enter position 1 to 9, followed by a period: '),
    read(Pos),
    ( integer(Pos), member(Pos, Board) ->
        move(Board, Pos, Player, NewBoard),
        ( win(Player, NewBoard) ->
            display(NewBoard), write('Player '), write(Player), write(' wins!'), nl
        ; draw(NewBoard) ->
            display(NewBoard), write('Game draw!'), nl
        ; change(Player, Next), game(NewBoard, Next)
        )
    ; write('Invalid move! Try again.'), nl, game(Board, Player)
    ).

change(x, o).
change(o, x).

Run

?- consult('tic_tac_toe.pl').
?- play.
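
The win/2 facts enumerate the eight winning lines by pattern. A rough Python equivalent could look like the sketch below, assuming the same flat nine-cell board in which free cells hold their position number. WIN_LINES, wins, is_draw, and place are hypothetical names used only for illustration.

```python
# Index triples for the eight winning lines of a board stored as a flat list.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
    (0, 4, 8), (2, 4, 6),              # diagonals
]

def wins(player, board):
    """True if `player` occupies all three cells of any winning line."""
    return any(all(board[i] == player for i in line) for line in WIN_LINES)

def is_draw(board):
    """True if no numbered (free) cell is left; check wins() first, as the Prolog code does."""
    return not any(isinstance(cell, int) for cell in board)

def place(board, pos, player):
    """Return a new board with `player` at free cell `pos`, or None if the move is invalid."""
    if not isinstance(pos, int) or pos not in board:
        return None
    new_board = board[:]
    new_board[pos - 1] = player
    return new_board

board = list(range(1, 10))
for pos in (1, 5, 9):                  # x takes the main diagonal
    board = place(board, pos, 'x')
print(wins('x', board))
```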

Practical 3: Python Fundamentals – Data Types, if-elif, Functions

Question

Implement Python programs based on data types, string operations, list operations, dictionary operations, tuple, set, loops, functions, and if-elif statements.

Aim

To understand basic Python programming concepts and implement fundamental operations using Python.

Program

# Practical 3 - Data Types, if-elif, Functions

print(2**10)

n1 = 10
n2 = 20
n3 = 30
print("sum of {} and {} is {}".format(n1, n2, (n1 + n2)))

str1 = "SIESCOMS Sector-5 Plot-1E Nerul 200706"
list0 = str1.split()
print(list0)
print(list0[3])

str2 = "Master of Computer Applications"
words = str2.split()
print(words[2])

str3 = "SIESCOMS&VESIT&MET&STERLING&BVIT"
wordList = str3.split("&")
print(wordList)
print(wordList[0])

my_list = ['a', 'b', 'c']
my_list.append('d')
my_list.append('e')
my_list.append('f')
print(my_list)

nest = ['one','two','three',['four','five','siescoms',['nerul','navi mumbai'],['400706']]]
print(nest[2])
print(nest[3][2])
print(nest[3][3][1])
print(nest[3][3][0][2])
print(nest[3][4][0])
print(nest[3][4][0][3])

lst = [1,2,[3,4],[5,[100,200,['hello']],23,11],1,7]
print(lst[3][1][2][0])

thisdict = {"brand": "Ford", "model": "Mustang", "year": 1964}
print(thisdict)

d1 = {
    'India': {'States': ['MAH','DEL','CHN','KOL']},
    'US': {'States': ['NY','CAL','WSH','TXS','FLO']},
    'EUROPE': {'States': ['SPN','ITL','GER','FRN']}
}
print(d1)
print(d1['US']['States'])
print(d1['India']['States'])
if 'CHN' in d1['India']['States']:
    print("State = CHN")
else:
    print("State not found")

people = {
    1: {'name': 'John', 'age': '27', 'gender': 'Male'},
    2: {'name': 'Marie', 'age': '22', 'gender': 'Female'}
}
print(people)
print(people[2])
print(people[2]['gender'])

s = "Master of Computer Applications"
print(s[0:6])
str2 = 'Master of Computer Applications'
print(str2[10:18])

planet = "Earth"
diameter = 12742
print("The diameter of {} is {} kilometers".format(planet, diameter))

d = {'k1':[1,2,3,{'tricky':['oh','man','inception',{'target':[1,2,3,'hello']}]}]}
print(d['k1'][3]['tricky'][3]['target'][3])

primarycolors = ('red','blue','yellow')
print(primarycolors)
primarycolors = ('orange',) + primarycolors[1:]
print(primarycolors)

mySet = {1,2,3,4,5,1,1,1,1,1,3,3,2,2,4,4,5}
print(mySet)

list1 = [1,2,3,4,5,1,1,1,1,1,3,3,2,2,4,4,5]
print(set(list1))

set1 = {100,100,200,300,400,400,500}
print(set1)
set1.add(500)
print(set1)

lstnos = [1,2,3,4,5,6]
for num in lstnos:
    print(num*num)

num = 1
while num < 11:
    print(num)
    num += 1

range0 = range(0,500)
print(list(range0))

x = range(0,101)
xList = list(x)
print(xList)

nums = []
for i in range(0,100):
    nums.append(i*i)
print(nums)

def cubeFunc(userInput=1):
    print(userInput**3)

cubeFunc(6)
cubeFunc()

def email(userEmail):
    emailStr = userEmail.split('@')
    return emailStr[1]

print(email('xyz@sies.edu.in'))

def countDog(st):
    countD = 0
    for w in st.split():
        word = w.lower().strip(".,!?")
        if word == "dog":
            countD += 1
    return countD

text = "The dog is a pet animal. A dog has sharp teeth. Dogs are sometimes called canines."
print(countDog(text))

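def caught_speeding(speed, is_birthday):
    # On a birthday the driver is allowed 5 over each limit, so the speed
    # is reduced by 5 before comparing (the usual form of this exercise).
    if is_birthday:
        speed -= 5
    if speed <= 60:
        return "No ticket"
    elif speed <= 80:
        return "Small ticket"
    else:
        return "Big ticket"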

print(caught_speeding(81, True))
print(caught_speeding(54, True))
print(caught_speeding(59, False))

Practical 4: NumPy Practicals

Question

Implement NumPy programs for array creation, matrix operations, reshaping, random number generation, and statistical calculations.

Aim

To perform numerical computations and matrix operations using the NumPy library.

Program

import numpy as np

print(np.zeros(10))
print(np.ones(10))
print(np.ones(10) * 5)
print(np.arange(10, 51))
print(np.arange(10, 51, 2))
print(np.arange(9).reshape(3, 3))
print(np.identity(3))
print(np.random.uniform(0, 1))
print(np.random.normal(size=25))

arr = np.arange(1, 101) / 100
print(arr.reshape(10, 10))
print(np.linspace(0, 1, 20))

mat = np.arange(1, 26).reshape(5, 5)
print(mat)
print(mat[3, 4])
print(mat.sum())
print(np.std(mat))
print(np.sum(mat, axis=0))
print(np.sum(mat, axis=1))
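
The difference between axis=0 and axis=1 is easy to mix up. The following plain-Python sketch rebuilds the same 5x5 matrix as nested lists and computes both sums by hand. It is only an illustration of what the two NumPy calls reduce over, and the variable names are assumptions.

```python
# Same values as np.arange(1, 26).reshape(5, 5), built as nested lists.
mat = [[5 * r + c + 1 for c in range(5)] for r in range(5)]

# axis=0 collapses the rows, leaving one sum per column.
column_sums = [sum(row[c] for row in mat) for c in range(5)]

# axis=1 collapses the columns, leaving one sum per row.
row_sums = [sum(row) for row in mat]

print(mat[3][4])      # element in row 3, column 4 -> 20
print(column_sums)    # [55, 60, 65, 70, 75]
print(row_sums)       # [15, 40, 65, 90, 115]
```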

Practical 5: Pandas Practicals (CSV-Free)

Question

Implement Pandas programs for DataFrame creation, column selection, missing value handling, grouping, and employee data analysis without using CSV files.

Aim

To perform data analysis and data manipulation using the Pandas library.

Program

import pandas as pd
import numpy as np

print(pd.DataFrame([1, 2, 3, 4, 5]))

df1 = pd.DataFrame({'Numbers': [1, 2, 3, 4, 5]})
print(df1)

print(pd.DataFrame({'Numbers': [1, 2, 3, 4, 5]}, index=['one', 'two', 'three', 'four', 'five']))

print(pd.DataFrame({'Name': ['John', 'Jack', 'Ryan'], 'Age': [11, 15, 18]}))

data = {'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28, 34, 29, 42], 'Mobile': [1234, 5678, 9876, 5432]}
df4 = pd.DataFrame(data)
print(df4)
print(df4['Name'])
print(df4.loc[df4['Name'] == 'Jack', 'Name'].values[0])
print(df4[['Name', 'Mobile']])

data = {'Name': ['Tom', 'Jack', 'Steve', 'Ricky', 'Greg'], 'Age': [28, 34, 29, 42, 54], 'Mobile': [1234, 5678, 9876, 5432, 5555]}
df5 = pd.DataFrame(data, index=['A', 'B', 'C', 'D', 'E'])
df5['m1'] = [55, 78, 90, 89, 78]
df5['m2'] = [85, 89, 79, 80, 89]
df5['Total'] = df5['m1'] + df5['m2']
df5['remarks'] = ['F', 'P', 'P', 'P', 'P']
print(df5)
df5.drop('remarks', axis=1, inplace=True)
df5.drop(index='D', inplace=True)
print(df5)
print(df5.shape)

data = {'Name': ['Harry', 'Lucy', 'Gerome', 'Steve'], 'Jan': ['P', 'P', 'A', np.nan], 'Feb': ['P', np.nan, 'A', np.nan], 'Mar': ['A', 'P', np.nan, np.nan], 'Apr': ['A', 'P', 'P', 'P'], 'May': ['P', 'P', 'P', 'P']}
df6 = pd.DataFrame(data)
print(df6)
print(df6.dropna())
print(df6.dropna(axis=1))
print(df6.fillna('Not Marked'))

marksdata = [['Jack', np.nan, 78, 90, 'Mumbai'], ['John', np.nan, np.nan, 90, 'Pune'], ['Arnold', 76, 50, 90, 'Mumbai'], ['Steven', 90, 78, np.nan, 'Nashik'], ['Juey', 78, 89, np.nan, 'Pune']]
df7 = pd.DataFrame(marksdata, columns=['Name', 'm1', 'm2', 'm3', 'City'])
print(df7)
print(df7.isna())
print(df7.fillna(75))
print(df7.fillna(df7.mean(numeric_only=True)))
print(df7.ffill())
print(df7.groupby('City').mean(numeric_only=True))
print(df7.bfill())

empdf = pd.DataFrame({
    'First Name': ['Tom', 'Jack', 'Steve', 'Ricky', 'Greg'],
    'Gender': ['Male', 'Male', 'Male', 'Male', 'Female'],
    'Team': ['IT', 'HR', 'IT', 'Finance', 'HR'],
    'Salary': [50000, 60000, 55000, 70000, 65000],
    'Bonus %': [10, 12, 8, 15, 11]
})
print(empdf)
print(empdf.shape)
empdf.info()  # info() prints its summary itself and returns None
print(empdf.head())
print(empdf.head(50))
print(empdf.tail())
print(empdf.tail(20))
print(empdf.drop_duplicates())
print(empdf.columns)
print(empdf.isnull().sum())
empdf.rename(columns={'Bonus %': 'DiwaliBonus'}, inplace=True)
print(empdf.columns)
empdf['Salary'] = empdf['Salary'].fillna(empdf['Salary'].mean())
print(empdf['Salary'].describe())
empdf['Gender'] = empdf['Gender'].fillna('Other')
print(empdf['Gender'].unique())
print(empdf.groupby('Gender').mean(numeric_only=True))
print(empdf.groupby('Team').mean(numeric_only=True))

Practical 6: Data Visualization (CSV-Free)

Question

Implement data visualization programs using Matplotlib and Pandas without using CSV files.

Aim

To create line charts, bar charts, histograms, scatter plots, box plots, and pie charts using Python visualization libraries.

Program

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

height = [0, 100, 200, 300, 400, 500]
temperature = [30, 28, 25, 22, 20, 18]
plt.plot(height, temperature)
plt.xlabel("Height (m)")
plt.ylabel("Temperature (°C)")
plt.title("Temperature vs Height")
plt.show()

date = ["25/12", "26/12", "27/12"]
temp = [8.5, 10.5, 6.8]
plt.plot(date, temp)
plt.xlabel("Date")
plt.ylabel("Temperature (°C)")
plt.title("Date wise Temperature")
plt.grid(True)
plt.show()

height = [121.9,124.5,129.5,134.6,139.7,147.3,152.4,157.5,162.6]
weight = [19.7,21.3,23.5,25.9,28.5,32.1,35.7,39.6,43.2]
plt.plot(weight, height, marker='*', markersize=10, color='green', linewidth=2, linestyle='dashed')
plt.xlabel("Weight (kg)")
plt.ylabel("Height (cm)")
plt.title("Average Weight vs Height")
plt.show()

df = pd.DataFrame({'Day': ['Day1','Day2','Day3','Day4','Day5'], 'Food': [1200,1500,1700,1600,1800], 'Games': [900,1100,1000,1300,1400], 'Stalls': [800,850,900,950,1000]})
df.plot(kind='line', marker='*', subplots=True, figsize=(10,10))
plt.show()
df.plot(kind='line', marker='*', linewidth=3, linestyle='--')
plt.xlabel("Days")
plt.ylabel("Sales in Rs")
plt.title("Cultural Mela Sales Report")
plt.show()
df.plot(kind='bar', x='Day', title='Cultural Mela Sales', grid=True)
plt.ylabel("Sales in Rs")
plt.show()
df.plot(kind='barh', x='Day', title='Cultural Mela Sales', grid=True, stacked=True)
plt.xlabel("Sales in Rs")
plt.show()

df = pd.DataFrame({'YearsExperience': [1,2,3,4,5,6,7,8,9,10], 'Salary': [25000,30000,35000,40000,45000,50000,60000,65000,70000,80000]})
df['YearsExperience'].plot(kind='hist', title='Histogram of Experience', bins=10, edgecolor='black')
plt.xlabel('Years of Experience')
plt.show()
df['Salary'].plot(kind='hist', title='Histogram of Salary', bins=10, edgecolor='black')
plt.xlabel('Salary')
plt.show()
plt.scatter(df['YearsExperience'], df['Salary'], c=np.random.rand(len(df)), cmap='viridis', marker='D')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.title('Scatter Plot of Experience vs Salary')
plt.show()

df = pd.DataFrame({'Name': ['Amit','Riya','John','Sara','Raj'], 'Gender': ['M','F','M','F','M'], 'Maths': [85,90,78,88,92], 'English': [80,86,75,82,89], 'Science': [88,91,79,85,90]})
df[['Maths','English','Science']].plot(kind='box', title='Performance Analysis')
plt.ylabel('Marks')
plt.show()
df.boxplot(column=['Maths', 'English'], by='Gender', grid=False)
plt.show()

df = pd.DataFrame({'Category': ['A', 'B', 'C', 'D'], 'Values': [20, 30, 25, 25]})
df.plot(kind='pie', y='Values', labels=df['Category'], autopct='%1.2f%%', figsize=(6,6))
plt.title('Custom Pie Chart')
plt.show()

Practical 7: 8 Puzzle Problem using Hill Climbing

Question

Implement the 8 Puzzle Problem using the Hill Climbing algorithm.

Aim

To solve the 8 Puzzle Problem using a heuristic-based Hill Climbing search technique.

Program

def heuristic(state, goal):
    count = 0
    for i in range(3):
        for j in range(3):
            if state[i][j] != 0 and state[i][j] != goal[i][j]:
                count += 1
    return count

def find_blank(state):
    for i in range(3):
        for j in range(3):
            if state[i][j] == 0:
                return i, j

def print_state(state):
    for row in state:
        print(row)
    print()

def generate_moves(state):
    moves = []
    x, y = find_blank(state)
    directions = {"Up": (-1, 0), "Down": (1, 0), "Left": (0, -1), "Right": (0, 1)}
    for move, (dx, dy) in directions.items():
        nx, ny = x + dx, y + dy
        if 0 <= nx < 3 and 0 <= ny < 3:
            new_state = [row[:] for row in state]
            new_state[x][y], new_state[nx][ny] = new_state[nx][ny], new_state[x][y]
            moves.append((move, new_state))
    return moves

def hill_climbing(start, goal):
    current = start
    print("Start State:")
    print_state(current)
    while current != goal:
        current_h = heuristic(current, goal)
        neighbors = generate_moves(current)
        best_move, best_state = min(neighbors, key=lambda x: heuristic(x[1], goal))
        best_h = heuristic(best_state, goal)
        if best_h >= current_h:
            print("No better move possible. Local optimum reached.")
            break
        print("Move:", best_move)
        print("Heuristic:", best_h)
        print_state(best_state)
        current = best_state
    if current == goal:
        print("Goal State Reached!")

start = [[1,2,3],[4,0,6],[7,5,8]]
goal = [[1,2,3],[4,5,6],[7,8,0]]
hill_climbing(start, goal)
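
The program above scores a state by counting misplaced tiles. A common alternative is the Manhattan distance, which also measures how far each tile is from its goal cell. The sketch below is an optional variation rather than part of the original practical, and the function name manhattan is an assumption.

```python
def manhattan(state, goal):
    """Sum of horizontal plus vertical distances of each tile from its goal cell."""
    # Look up where every tile belongs in the goal state.
    goal_pos = {goal[i][j]: (i, j) for i in range(3) for j in range(3)}
    total = 0
    for i in range(3):
        for j in range(3):
            tile = state[i][j]
            if tile != 0:                      # the blank is not counted
                gi, gj = goal_pos[tile]
                total += abs(i - gi) + abs(j - gj)
    return total

start = [[1, 2, 3], [4, 0, 6], [7, 5, 8]]
goal = [[1, 2, 3], [4, 5, 6], [7, 8, 0]]
print(manhattan(start, goal))   # tiles 5 and 8 are each one step away -> 2
```

Passing this function in place of heuristic leaves the rest of the hill-climbing loop unchanged, since both return a value where lower is better.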

Practical 8: Perceptron Algorithm for OR Operation

Question

Implement the Perceptron Algorithm for OR operation.

Aim

To understand and implement the Perceptron learning algorithm for the OR gate.

Program 1: Manual Perceptron Implementation

inputs = [[0,0],[0,1],[1,0],[1,1]]
outputs = [0,1,1,1]
weights = [0,0]
bias = 0
learning_rate = 1

for epoch in range(5):
    print("Epoch:", epoch + 1)
    for x, target in zip(inputs, outputs):
        summation = x[0] * weights[0] + x[1] * weights[1] + bias
        prediction = 1 if summation >= 1 else 0
        error = target - prediction
        weights[0] += learning_rate * error * x[0]
        weights[1] += learning_rate * error * x[1]
        bias += learning_rate * error
        print("Input:", x, "Target:", target, "Prediction:", prediction)

print("Final Weights:", weights)
print("Final Bias:", bias)
print("Testing OR Gate:")
for x in inputs:
    summation = x[0] * weights[0] + x[1] * weights[1] + bias
    prediction = 1 if summation >= 1 else 0
    print(x, "=", prediction)
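
The fixed count of five epochs is more than this data set needs. As a rough sketch, the same update rule can be wrapped in a function that stops as soon as a full pass makes no mistakes. The function name train_perceptron and the max_epochs parameter are assumptions; the threshold of 1 matches the program above.

```python
def train_perceptron(samples, targets, learning_rate=1, max_epochs=20):
    """Perceptron rule with early stopping; returns (weights, bias, epochs_used)."""
    weights = [0, 0]
    bias = 0
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for x, target in zip(samples, targets):
            summation = x[0] * weights[0] + x[1] * weights[1] + bias
            prediction = 1 if summation >= 1 else 0
            error = target - prediction
            if error != 0:
                mistakes += 1
                weights[0] += learning_rate * error * x[0]
                weights[1] += learning_rate * error * x[1]
                bias += learning_rate * error
        if mistakes == 0:          # a clean pass means the weights have converged
            return weights, bias, epoch
    return weights, bias, max_epochs

inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
outputs = [0, 1, 1, 1]
weights, bias, epochs_used = train_perceptron(inputs, outputs)
predictions = [1 if x[0] * weights[0] + x[1] * weights[1] + bias >= 1 else 0 for x in inputs]
print("Weights:", weights, "Bias:", bias, "Epochs:", epochs_used)
print("Predictions:", predictions)
```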

Program 2: Perceptron using Scikit-Learn

from sklearn.linear_model import Perceptron
import numpy as np

X = np.array([[0,0],[0,1],[1,0],[1,1]])
y = np.array([0,1,1,1])

model = Perceptron()
model.fit(X, y)

print("Predictions:", model.predict(X))
print("Weights:", model.coef_)
print("Bias:", model.intercept_)
print(model.predict([[1,1]]))

Practical 9: Perceptron using Stochastic Gradient Descent

Question

Implement a Perceptron using Stochastic Gradient Descent.

Aim

To train a Perceptron model using the Stochastic Gradient Descent learning approach.

Program

def predict(row, weights):
    activation = weights[0]
    for i in range(len(row)-1):
        activation += weights[i+1] * row[i]
    return 1 if activation >= 0 else 0

def train_weights(train, learning_rate, epochs):
    weights = [0.0 for i in range(len(train[0]))]
    for epoch in range(epochs):
        sum_error = 0
        for row in train:
            prediction = predict(row, weights)
            error = row[-1] - prediction
            sum_error += error ** 2
            weights[0] += learning_rate * error
            for i in range(len(row)-1):
                weights[i+1] += learning_rate * error * row[i]
        print("Epoch:", epoch+1, "Error:", sum_error)
    return weights

dataset = [[0,0,0],[0,1,1],[1,0,1],[1,1,1]]
weights = train_weights(dataset, 0.1, 10)

print("Final Weights:", weights)

for row in dataset:
    prediction = predict(row, weights)
    print(row[0:2], "Expected:", row[-1], "Predicted:", prediction)

Practical 10: ADALINE Algorithm for AND Operation

Question

Implement the ADALINE Algorithm for AND operation.

Aim

To understand and implement the ADALINE learning algorithm for the AND gate.

Program

import numpy as np

X = np.array([[0,0],[0,1],[1,0],[1,1]])
y = np.array([0,0,0,1])

weights = np.zeros(2)
bias = 0
learning_rate = 0.1
epochs = 10

for epoch in range(epochs):
    total_error = 0
    for i in range(len(X)):
        net_input = np.dot(X[i], weights) + bias
        output = net_input
        error = y[i] - output
        weights = weights + learning_rate * error * X[i]
        bias = bias + learning_rate * error
        total_error += error ** 2
    print("Epoch:", epoch + 1, "Error:", total_error)

print("Final Weights:", weights)
print("Final Bias:", bias)
print("Testing AND Gate:")

threshold = 0.5
for i in range(len(X)):
    net_input = np.dot(X[i], weights) + bias
    prediction = 1 if net_input >= threshold else 0
    print(X[i], "=", prediction)
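
ADALINE updates its weights to reduce the squared error of the linear output, so the weights it moves toward are the least-squares solution. For the AND gate that solution is w1 = w2 = 0.5 and bias = -0.25, which can be checked by confirming that the gradient of the summed squared error is zero there. The sketch below is a verification aid in plain Python, and the function name sse_gradient is an assumption.

```python
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 0, 0, 1]

def sse_gradient(w1, w2, b):
    """Gradient of sum((target - output)^2) with respect to w1, w2 and b."""
    g1 = g2 = gb = 0.0
    for (x1, x2), target in zip(X, y):
        error = target - (w1 * x1 + w2 * x2 + b)
        g1 += -2 * error * x1
        g2 += -2 * error * x2
        gb += -2 * error
    return g1, g2, gb

# At the least-squares weights every component of the gradient vanishes.
print(sse_gradient(0.5, 0.5, -0.25))

# The linear outputs there are -0.25, 0.25, 0.25, 0.75, so a 0.5 threshold separates AND.
print([0.5 * x1 + 0.5 * x2 - 0.25 for x1, x2 in X])
```

With a fixed learning rate and per-sample updates, the training loop above only approaches these values; it does not land on them exactly.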

Practical 11: Dimensionality Reduction (CSV-Free)

Question

Implement dimensionality reduction techniques including Feature Selection, Standardization, Normalization, Linear Discriminant Analysis, and Principal Component Analysis.

Aim

To understand and apply dimensionality reduction techniques using Python and Scikit-learn.

Program

import pandas as pd
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.datasets import load_wine, load_iris
from sklearn.model_selection import train_test_split
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, accuracy_score
from sklearn.decomposition import PCA
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'battery_power': [800,1200,1500,1800,2000],
    'ram': [512,1024,2048,3072,4096],
    'px_height': [500,800,1200,1500,1800],
    'price_range': [0,1,2,3,3]
})
X = df.drop('price_range', axis=1)
y = df['price_range']
bestfeatures = SelectKBest(score_func=chi2, k=2)
fit = bestfeatures.fit(X, y)
feature_scores = pd.DataFrame({'Feature': X.columns, 'Score': fit.scores_})
print(feature_scores.sort_values(by='Score', ascending=False))

df = pd.DataFrame({'Income':[25000,30000,45000,50000,60000], 'Loan':[10000,12000,15000,18000,20000], 'Age':[25,30,35,40,45]})
scaler = StandardScaler()
scaled_df = pd.DataFrame(scaler.fit_transform(df), columns=df.columns)
print(scaled_df.head())
normalizer = MinMaxScaler()
normalized_df = pd.DataFrame(normalizer.fit_transform(df), columns=df.columns)
print(normalized_df.head())

wine = load_wine()
X = pd.DataFrame(wine.data, columns=wine.feature_names)
y = wine.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
for n in [1, 2]:
    lda = LinearDiscriminantAnalysis(n_components=n)
    X_train_lda = lda.fit_transform(X_train, y_train)
    X_test_lda = lda.transform(X_test)
    model = LogisticRegression(max_iter=1000)
    model.fit(X_train_lda, y_train)
    y_pred = model.predict(X_test_lda)
    print("LDA Components:", n)
    print(confusion_matrix(y_test, y_pred))
    print("Accuracy:", accuracy_score(y_test, y_pred))

pca = PCA(n_components=2)
X_train_pca = pca.fit_transform(X_train)
X_test_pca = pca.transform(X_test)
model = LogisticRegression(max_iter=1000)
model.fit(X_train_pca, y_train)
y_pred = model.predict(X_test_pca)
print(confusion_matrix(y_test, y_pred))
print("Accuracy:", accuracy_score(y_test, y_pred))

iris = load_iris()
X = iris.data
y = iris.target
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)
print(pd.DataFrame(X_pca, columns=['PC1', 'PC2']).head())
print("Explained Variance Ratio:", pca.explained_variance_ratio_)
plt.scatter(X_pca[:,0], X_pca[:,1], c=y)
plt.xlabel("Principal Component 1")
plt.ylabel("Principal Component 2")
plt.title("PCA on Iris Dataset")
plt.show()
print("Original Shape:", iris.data.shape)
print("Reduced Shape:", X_pca.shape)
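
Standardization and normalization are simple per-column formulas. As an illustration, the sketch below applies them by hand to the Age column used earlier: z = (x - mean) / std with the population standard deviation, and (x - min) / (max - min) for min-max scaling. The helper names standardize and min_max are assumptions.

```python
from statistics import mean, pstdev

def standardize(values):
    """Zero mean, unit variance; uses the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]

def min_max(values):
    """Rescale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

ages = [25, 30, 35, 40, 45]
print([round(z, 3) for z in standardize(ages)])   # [-1.414, -0.707, 0.0, 0.707, 1.414]
print(min_max(ages))                              # [0.0, 0.25, 0.5, 0.75, 1.0]
```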

Practical 12: Logistic Regression for Titanic Survival (CSV-Free)

Question

Implement Logistic Regression for Titanic survival prediction using a CSV-free sample dataset.

Aim

To build and evaluate a Logistic Regression model for binary classification.

Program

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score, precision_score, recall_score

df = pd.DataFrame({
    'Survived': [0,1,1,1,0,0,1,0,1,0],
    'Pclass': [3,1,3,1,3,2,1,3,2,3],
    'Sex': ['male','female','female','female','male','male','female','male','female','male'],
    'Age': [22,38,26,35,None,54,28,2,14,None],
    'SibSp': [1,1,0,1,0,0,0,3,1,0],
    'Parch': [0,0,0,0,0,0,0,1,0,0],
    'Fare': [7.25,71.28,7.92,53.1,8.05,51.86,30.0,21.07,30.07,8.45],
    'Embarked': ['S','C','S','S','S','S','C','S','C',None]
})
print(df.head())
print((df.isnull().sum() / len(df)) * 100)
print(df['Survived'].value_counts())
df['Survived'].value_counts().plot(kind='bar', color=['red','green'])
plt.xlabel("Survived")
plt.ylabel("Count")
plt.title("Titanic Survival Count")
plt.show()

df['Age'] = df['Age'].fillna(df['Age'].mean())
df['Embarked'] = df['Embarked'].fillna(df['Embarked'].mode()[0])
df = pd.get_dummies(df, columns=['Sex','Embarked'], drop_first=True)
X = df[['Pclass','Age','SibSp','Parch','Fare','Sex_male']]
y = df['Survived']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=42)
logmodel = LogisticRegression(max_iter=1000)
logmodel.fit(X_train, y_train)
predictions = logmodel.predict(X_test)
cm = confusion_matrix(y_test, predictions)
print(cm)
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
plt.title("Confusion Matrix")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.show()
print(classification_report(y_test, predictions))
print("Accuracy =", accuracy_score(y_test, predictions))
print("Precision =", precision_score(y_test, predictions, zero_division=0))
print("Recall =", recall_score(y_test, predictions, zero_division=0))

X_short = df[['Pclass', 'Age', 'Fare']].copy()
y_short = df['Survived']
X_train, X_test, y_train, y_test = train_test_split(X_short, y_short, test_size=0.2, random_state=42)
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print("Predictions:", y_pred)
print("Accuracy =", accuracy_score(y_test, y_pred))
new_passenger = pd.DataFrame([[3,25,100]], columns=['Pclass','Age','Fare'])
prediction = model.predict(new_passenger)
print("Survived" if prediction[0] == 1 else "Deceased")
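
Accuracy, precision, and recall all come from the four cells of the confusion matrix. The sketch below computes them by hand for a made-up pair of label lists, which are illustrative values and not output from the model above. The function name binary_metrics is an assumption.

```python
def binary_metrics(actual, predicted):
    """Confusion-matrix counts and the metrics derived from them (positive class = 1)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return {
        "matrix": [[tn, fp], [fn, tp]],                      # rows = actual, columns = predicted
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,   # same idea as zero_division=0
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Illustrative labels only.
actual = [1, 0, 1, 1, 0, 0]
predicted = [1, 0, 0, 1, 1, 0]
result = binary_metrics(actual, predicted)
print(result["matrix"])    # [[2, 1], [1, 2]]
print(result["accuracy"], result["precision"], result["recall"])
```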

Practical 13: Support Vector Machine and SVM Kernels (CSV-Free)

Question

Implement Support Vector Machine classification and compare different SVM kernels using CSV-free datasets.

Aim

To classify data using Support Vector Machines and evaluate the performance of different kernel functions.

Program

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

iris = load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
for kernel in ['linear', 'rbf', 'poly', 'sigmoid']:
    model = SVC(kernel=kernel)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    print(kernel, "Accuracy =", accuracy_score(y_test, y_pred))

df = pd.DataFrame({
    'customerid': [1,2,3,4,5,6,7,8,9,10],
    'age': [25,35,45,55,60,30,40,50,65,70],
    '401k savings': [10000,30000,80000,120000,150000,20000,60000,100000,180000,200000],
    'retire': [0,0,0,1,1,0,0,1,1,1]
})
print(df.head())
print(df.shape)
print(df.head(100))
sns.pairplot(df, vars=['age', '401k savings'], hue='retire')
plt.show()
df.drop('customerid', axis=1, inplace=True)
X = df.drop('retire', axis=1)
y = df['retire']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)
print("X_train =", X_train.shape)
print("X_test =", X_test.shape)
print("y_train =", y_train.shape)
print("y_test =", y_test.shape)
svmmodel = SVC(kernel='linear')
svmmodel.fit(X_train, y_train)
y_pred = svmmodel.predict(X_test)
print(y_pred)
cm = confusion_matrix(y_test, y_pred)
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues')
plt.title("Confusion Matrix")
plt.xlabel("Predicted")
plt.ylabel("Actual")
plt.show()
print(classification_report(y_test, y_pred, zero_division=0))
print("Accuracy =", accuracy_score(y_test, y_pred))
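
Each kernel name passed to SVC corresponds to a similarity function between two samples. The plain-Python sketch below writes out the four standard formulas. The gamma, coef0, and degree values here are illustrative assumptions; scikit-learn derives its default gamma from the training data, so these numbers will not match its internals exactly.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def linear_kernel(a, b):
    return dot(a, b)

def poly_kernel(a, b, gamma=1.0, coef0=0.0, degree=3):
    return (gamma * dot(a, b) + coef0) ** degree

def rbf_kernel(a, b, gamma=1.0):
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * squared_distance)

def sigmoid_kernel(a, b, gamma=1.0, coef0=0.0):
    return math.tanh(gamma * dot(a, b) + coef0)

a, b = [1, 2], [3, 4]
print("linear :", linear_kernel(a, b))            # 1*3 + 2*4 = 11
print("poly   :", poly_kernel(a, b))              # 11 ** 3 = 1331.0
print("rbf    :", rbf_kernel(a, b, gamma=0.5))    # exp(-0.5 * 8)
print("sigmoid:", sigmoid_kernel(a, b, gamma=0.1))
```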

Practical 14: K-Means Clustering (CSV-Free)

Question

Implement K-Means Clustering using CSV-free datasets and demonstrate cluster formation, Elbow Method, and clustering on the Iris dataset.

Aim

To understand and apply K-Means Clustering for unsupervised learning.

Program

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import LabelEncoder
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

df = pd.DataFrame({
    'Country': ['India','USA','France','Germany','Japan','UK','Canada','China','Italy','Brazil'],
    'Latitude': [20.59,37.09,46.22,51.16,36.20,55.37,56.13,35.86,41.87,-14.23],
    'Longitude': [78.96,-95.71,2.21,10.45,138.25,-3.43,-106.34,104.19,12.56,-51.92],
    'Language': ['Hindi','English','French','German','Japanese','English','English','Mandarin','Italian','Portuguese']
})
print(df.head())
print("Rows and Columns:", df.shape)
language_mapping = {'English':1, 'Hindi':2, 'French':3, 'German':4, 'Japanese':5, 'Mandarin':6, 'Italian':7, 'Portuguese':8}
df['Language'] = df['Language'].map(language_mapping)
print(df.head())
print(language_mapping)
le = LabelEncoder()
df['Country'] = le.fit_transform(df['Country'])
X = df[['Country','Latitude','Longitude']]

kmeansmodel1 = KMeans(n_clusters=2, random_state=42, n_init=10)
kmeansmodel1.fit(X)
plt.scatter(df['Longitude'], df['Latitude'], c=kmeansmodel1.labels_, cmap='rainbow')
plt.xlabel("Longitude")
plt.ylabel("Latitude")
plt.title("K-Means Clustering (2 Clusters)")
plt.show()

wcss = []
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, random_state=42, n_init=10)
    kmeans.fit(X)
    wcss.append(kmeans.inertia_)
plt.plot(range(1,11), wcss, marker='o')
plt.xlabel("Number of Clusters")
plt.ylabel("WCSS")
plt.title("Elbow Method")
plt.show()

kmeansmodel2 = KMeans(n_clusters=3, random_state=42, n_init=10)
kmeansmodel2.fit(X)
plt.scatter(df['Longitude'], df['Latitude'], c=kmeansmodel2.labels_, cmap='rainbow')
plt.xlabel("Longitude")
plt.ylabel("Latitude")
plt.title("Final K-Means Clustering")
plt.show()

iris = load_iris()
X = iris.data
kmeans = KMeans(n_clusters=3, random_state=42, n_init=10)
kmeans.fit(X)
labels = kmeans.labels_
centers = kmeans.cluster_centers_
print("Cluster Labels:", labels)
print("Cluster Centers:", centers)
plt.scatter(X[:,0], X[:,1], c=labels)
plt.scatter(centers[:,0], centers[:,1], marker='X', s=200)
plt.xlabel("Sepal Length")
plt.ylabel("Sepal Width")
plt.title("K-Means Clustering on Iris Dataset")
plt.show()
new_data = [[5.1, 3.5, 1.4, 0.2]]
print("Cluster:", kmeans.predict(new_data))
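
To see what KMeans does internally, the assign-then-update loop can be sketched in plain Python. This is a minimal illustration with fixed starting centers and made-up points, not scikit-learn's implementation, which also handles initialization and repeated restarts. The names kmeans, assign, and squared_distance are assumptions.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda k: squared_distance(p, centers[k]))
            for p in points]

def kmeans(points, centers, max_iter=100):
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                # The new center is the mean of the points assigned to it.
                new_centers.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centers.append(centers[k])   # keep an empty cluster's center in place
        if new_centers == centers:               # no movement means convergence
            break
        centers = new_centers
    labels = assign(points, centers)
    wcss = sum(squared_distance(p, centers[lab]) for p, lab in zip(points, labels))
    return labels, centers, wcss

# Illustrative points forming two obvious groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
labels, centers, wcss = kmeans(points, centers=[(1, 1), (8, 8)])
print("Labels:", labels)
print("Centers:", centers)
print("WCSS:", wcss)
```

The wcss value returned here is the same quantity that the Elbow Method plots through inertia_.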

Practical 15: Boosting Algorithms (CSV-Free)

Question

Implement Boosting and Ensemble Learning algorithms including AdaBoost, Gradient Boosting, and Voting Ensemble.

Aim

To understand and apply ensemble learning techniques for improving classification accuracy.

Program

from sklearn.datasets import load_breast_cancer, load_iris
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

data = load_breast_cancer()
X = data.data
y = data.target

adaboost = AdaBoostClassifier(n_estimators=30, random_state=42)
kfold = KFold(n_splits=10, shuffle=True, random_state=42)
results = cross_val_score(adaboost, X, y, cv=kfold)
print("AdaBoost Accuracy =", results.mean())

gradient = GradientBoostingClassifier(n_estimators=100, random_state=42)
results = cross_val_score(gradient, X, y, cv=kfold)
print("Gradient Boosting Accuracy =", results.mean())

model1 = LogisticRegression(max_iter=1000)
model2 = DecisionTreeClassifier(random_state=42)
model3 = SVC(probability=True, random_state=42)
voting_model = VotingClassifier(estimators=[('lr', model1), ('dt', model2), ('svc', model3)], voting='soft')
results = cross_val_score(voting_model, X, y, cv=kfold)
print("Voting Ensemble Accuracy =", results.mean())

iris = load_iris()
X = iris.data
y = iris.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

ada = AdaBoostClassifier(random_state=42)
ada.fit(X_train, y_train)
y_pred = ada.predict(X_test)
print("AdaBoost Accuracy:", accuracy_score(y_test, y_pred))

gb = GradientBoostingClassifier(random_state=42)
gb.fit(X_train, y_train)
y_pred = gb.predict(X_test)
print("Gradient Boosting Accuracy:", accuracy_score(y_test, y_pred))

model1 = LogisticRegression(max_iter=200)
model2 = DecisionTreeClassifier(random_state=42)
model3 = SVC(probability=True)
voting = VotingClassifier(estimators=[('lr', model1), ('dt', model2), ('svm', model3)], voting='soft')
voting.fit(X_train, y_train)
y_pred = voting.predict(X_test)
print("Voting Ensemble Accuracy:", accuracy_score(y_test, y_pred))
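The program above passes voting='soft' without showing what that option does. The following is a minimal sketch, assuming a reduced two-model ensemble (Logistic Regression and a Decision Tree) chosen only to keep the example short. It averages the class probabilities of the fitted base models by hand and compares the result with the ensemble's own predictions. The names avg_proba and manual_pred are illustrative and not part of the original program.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Assumption: a smaller two-model ensemble than the one in the practical.
voting = VotingClassifier(
    estimators=[
        ('lr', LogisticRegression(max_iter=200)),
        ('dt', DecisionTreeClassifier(random_state=42)),
    ],
    voting='soft',
)
voting.fit(X_train, y_train)

# Soft voting: average the predicted class probabilities of every fitted base model.
avg_proba = np.mean([est.predict_proba(X_test) for est in voting.estimators_], axis=0)

# The predicted class is the one with the highest averaged probability.
manual_pred = voting.classes_[np.argmax(avg_proba, axis=1)]

print("Averaged probabilities shape:", avg_proba.shape)
print("Matches VotingClassifier:", np.array_equal(manual_pred, voting.predict(X_test)))
```

With voting='hard', the ensemble would instead take a majority vote over the predicted labels, so the base models would not need to expose predict_proba.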

Important Questions for Practical Examination

Python Basics Important Questions

String Split

str1 = "SIESCOMS Sector-5 Plot-1E Nerul 400706"
list0 = str1.split()
print(list0)

Nested List Indexing

nest = ['one','two','three',['four','five','siescoms',['nerul','navi mumbai'],['400706']]]
print(nest[3][2])
print(nest[3][3][1])

Dictionary Access

people = {
    1: {'name': 'John', 'age': '27', 'gender': 'Male'},
    2: {'name': 'Marie', 'age': '22', 'gender': 'Female'}
}
print(people[2]['gender'])

Set and Unique Values

list1 = [1,2,3,4,5,1,1,1,1,1,3,3,2,2,4,4,5]
print(set(list1))
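A set does not promise any particular element order, so a common follow-up question asks for the unique values as a list. The sketch below shows two ways to do this. The list named values is a hypothetical example, picked so that the two results differ.

```python
# Hypothetical input list, chosen so that sorted order and first-seen order differ.
values = [3, 1, 3, 2, 1]

# Unique values in ascending order.
unique_sorted = sorted(set(values))

# Unique values in the order they first appear (dict keys keep insertion order).
unique_in_order = list(dict.fromkeys(values))

print(unique_sorted)    # [1, 2, 3]
print(unique_in_order)  # [3, 1, 2]
```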

Function - Cube of Number

def cubeFunc(userInput=1):
    print(userInput**3)

cubeFunc(6)
cubeFunc()
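The function above prints the cube, so the result cannot be reused in a later calculation. A possible variant, sketched here under the illustrative name cube, returns the value and leaves printing to the caller.

```python
def cube(number=1):
    """Return number cubed; the default argument is 1."""
    return number ** 3

print(cube(6))        # 216
print(cube())         # 1
print(cube(2) + 10)   # 18, the returned value can be used in an expression
```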

Advanced Important Practical Programs

Important Perceptron OR Gate

from sklearn.linear_model import Perceptron
import numpy as np

X = np.array([[0,0],[0,1],[1,0],[1,1]])
y = np.array([0,1,1,1])

model = Perceptron()
model.fit(X, y)

pred = model.predict(X)
print("Predictions:")
print(pred)
print("Weights:", model.coef_)
print("Bias:", model.intercept_)
print("Test [1,1]:", model.predict([[1,1]]))
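The printed weights and bias can be checked by hand. The sketch below recomputes the weighted sum for each input row and applies a step activation, which is how a binary perceptron decides between class 0 and class 1. The names scores and manual_pred are illustrative.

```python
import numpy as np
from sklearn.linear_model import Perceptron

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 1, 1, 1])

model = Perceptron()
model.fit(X, y)

# Weighted sum w . x + b for every input row.
scores = X @ model.coef_.ravel() + model.intercept_[0]

# Step activation: class 1 when the score is positive, otherwise class 0.
manual_pred = (scores > 0).astype(int)

print("Scores:", scores)
print("Manual predictions:", manual_pred)
print("Same as model.predict:", np.array_equal(manual_pred, model.predict(X)))
```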

PCA on Iris Dataset

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
import pandas as pd

iris = load_iris()
X = iris.data

pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)

df = pd.DataFrame(X_pca, columns=['PC1', 'PC2'])
print(df.head())
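Reducing four features to two components discards some information. As a rough check, the sketch below prints how much of the total variance each component retains, using the explained_variance_ratio_ attribute of the fitted PCA object. The variable name ratio is illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA

X = load_iris().data

pca = PCA(n_components=2)
pca.fit(X)

# Fraction of the total variance captured by each principal component.
ratio = pca.explained_variance_ratio_

print("Variance explained by PC1 and PC2:", ratio)
print("Total variance retained:", ratio.sum())
```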

SVM Classification using Iris Dataset

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

iris = load_iris()
X = iris.data
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = SVC(kernel='linear')
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print("Accuracy =", accuracy_score(y_test, y_pred))

K-Means Important Program

import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.cluster import KMeans

iris = load_iris()
X = iris.data

kmeans = KMeans(n_clusters=3, random_state=42, n_init=10)
kmeans.fit(X)

labels = kmeans.labels_
centers = kmeans.cluster_centers_

print("Cluster Labels:")
print(labels)

print("Cluster Centers:")
print(centers)

plt.scatter(X[:,0], X[:,1], c=labels)
plt.scatter(centers[:,0], centers[:,1], marker='X', s=200)
plt.xlabel("Sepal Length")
plt.ylabel("Sepal Width")
plt.title("K-Means Clustering on Iris Dataset")
plt.show()

new_data = [[5.1, 3.5, 1.4, 0.2]]
prediction = kmeans.predict(new_data)
print("Cluster:", prediction)
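The program above fixes n_clusters=3 without showing how that value might be chosen. One common heuristic is the elbow method: fit K-Means for several values of k and look at where the inertia (within-cluster sum of squared distances) stops dropping sharply. The following is a minimal sketch of that idea. The range of k values tried here is an assumption.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

X = load_iris().data

inertias = []
# Assumption: trying k = 1 to 6 is enough to see the bend for this dataset.
for k in range(1, 7):
    km = KMeans(n_clusters=k, random_state=42, n_init=10)
    km.fit(X)
    inertias.append(km.inertia_)
    print("k =", k, "inertia =", round(km.inertia_, 2))
```

Plotting inertias against k with plt.plot would show the curve, and the bend in that curve suggests a reasonable number of clusters.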

Boosting Algorithms Important Program

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

iris = load_iris()
X = iris.data
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

ada = AdaBoostClassifier(random_state=42)
ada.fit(X_train, y_train)
y_pred = ada.predict(X_test)
print("AdaBoost Accuracy:", accuracy_score(y_test, y_pred))

gb = GradientBoostingClassifier(random_state=42)
gb.fit(X_train, y_train)
y_pred = gb.predict(X_test)
print("Gradient Boosting Accuracy:", accuracy_score(y_test, y_pred))

model1 = LogisticRegression(max_iter=200)
model2 = DecisionTreeClassifier(random_state=42)
model3 = SVC(probability=True, random_state=42)

voting = VotingClassifier(
    estimators=[('lr', model1), ('dt', model2), ('svm', model3)],
    voting='soft'
)

voting.fit(X_train, y_train)
y_pred = voting.predict(X_test)
print("Voting Ensemble Accuracy:", accuracy_score(y_test, y_pred))

Learning Outcomes

After completing these practicals, students will understand:

Logic programming using Prolog
Python fundamentals
NumPy and Pandas operations
Data visualization
Search algorithms
Perceptron and ADALINE models
Dimensionality reduction
Logistic Regression
SVM and kernels
K-Means clustering
Ensemble learning

Screenshots

![Screenshot](images/screenshot.png)

Author

Armaan Bimalpati
MCA Semester 2
Mumbai University
NEP 2020

Conclusion

This extension presents MCA Semester 2 AIML practical programs and important practical questions in a professional VS Code Marketplace portfolio format.

ASUS TUF Gaming F17 AIML Practicals