Check if the port for torchrun is open via ncat

From Master of Neuroscience Wiki

Questions to David Rotermund

This is my script (connection_test.sh) that tests if a connection between two computer of a given port is possible:

You need the master_ip and master_port:

master_ip="10.10.10.10"
master_port="40001"
python_file="main.py"

ip_check=`ip addr | grep $master_ip | wc -l`

if [[ $ip_check == "1" ]]
then
    echo "Master"
    echo "OK OK OK OK OK OK OK" | ncat -l -p $master_port
else
    echo "Client"
    ncat $master_ip $master_port

The script needs to be started on the computer with the master ip first.

If this fails, the port is already used or your firewall settings is blocking it.