Parallel Python
Topic: How to schedule processors in a cluster of workstations?
tillwemeetagain
« on: May 15, 2012, 07:35:39 AM »

Hi all,
I am working on a cluster of 4 workstations, each with 12 cores and 24 GB of memory. The computation is very large, so I want to run it in parallel. I set ncpus = 0 and used the local machine as a remote server. The code is something like:
import pp

ncpus = 0  # no local worker processes; all jobs go to the ppservers
ppservers = ("node1", "node2", "node3", "node4")
job_server = pp.Server(ncpus, ppservers=ppservers)
On each workstation I started ppserver.py with -w 12, but the program never seemed to finish, so I cancelled it. Instead I used the local machine as the local server, with this code:
#ncpus = 0
ppservers=("node1","node2")
job_server = pp.Server(ppservers=ppservers)
I did not start ppserver.py on the local machine; ppserver.py was started only on the remote machine, without the -w parameter. I got the following output:
Job execution statistics:
 job count | % of all jobs | job time sum | time per job | job server
        24 |         50.00 |       0.0000 |     0.000000 | node2:60000
        24 |         50.00 |     2375.0645 |    98.961023 | local
Time elapsed since server creation 233.92422986
WARNING: statistics provided above is not accurate due to job rescheduling.
It looks as if the remote server did not actually do its half of the jobs: it is credited with 24 jobs, but its job time sum is 0.
So my question is: how should I set the parameters so that each core does its share of the work?
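A minimal sketch of how the statistics above could be checked automatically. It assumes the table has been captured as text in exactly the format pasted above; `parse_pp_stats` and `idle_servers` are hypothetical helper names, not part of pp.

```python
def parse_pp_stats(text):
    """Parse the rows of a pasted 'Job execution statistics' table into dicts."""
    rows = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 5:
            continue  # not a table row (title line, warnings, etc.)
        try:
            rows.append({
                "jobs": int(parts[0]),
                "percent": float(parts[1]),
                "time_sum": float(parts[2]),
                "time_per_job": float(parts[3]),
                "server": parts[4],
            })
        except ValueError:
            continue  # header row: its columns are not numeric
    return rows


def idle_servers(rows):
    """Return servers that were assigned jobs but report zero job time."""
    return [r["server"] for r in rows if r["jobs"] > 0 and r["time_sum"] == 0.0]


# The table from the post above, copied verbatim.
STATS = """\
Job execution statistics:
 job count | % of all jobs | job time sum | time per job | job server
        24 |         50.00 |       0.0000 |     0.000000 | node2:60000
        24 |         50.00 |     2375.0645 |    98.961023 | local
"""

rows = parse_pp_stats(STATS)
print(idle_servers(rows))  # prints ['node2:60000']
```

A server that appears in this list was handed jobs but never reported time spent on them, which fits the "job rescheduling" warning: the work was probably redone elsewhere.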
Vitalii
« Reply #1 on: June 02, 2012, 12:20:17 AM »

If your jobs are large, you might need to increase TRANSPORT_SOCKET_TIMEOUT.
You can also run ppserver.py with the -d flag to get a detailed log.
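A rough sketch of one way to raise the timeout from a script. It assumes that, in your pp version, TRANSPORT_SOCKET_TIMEOUT is a module-level constant in the `pptransport` module and that it is read when connections are created; check your installed source before relying on this. `set_transport_timeout` is a hypothetical helper, not part of pp.

```python
import importlib


def set_transport_timeout(seconds, module_name="pptransport"):
    """Set TRANSPORT_SOCKET_TIMEOUT on the named module and return the old value.

    module_name defaults to "pptransport", which is an assumption about
    where the constant lives in the installed pp version.
    """
    module = importlib.import_module(module_name)
    if not hasattr(module, "TRANSPORT_SOCKET_TIMEOUT"):
        raise AttributeError(
            "%s has no TRANSPORT_SOCKET_TIMEOUT; check your pp version" % module_name
        )
    previous = module.TRANSPORT_SOCKET_TIMEOUT
    module.TRANSPORT_SOCKET_TIMEOUT = seconds
    return previous


# Intended use (not executed here, since it needs pp installed):
#   set_transport_timeout(24 * 60 * 60)   # call before creating pp.Server(...)
#   job_server = pp.Server(ppservers=("node1", "node2"))
```

A change like this only affects the process that runs it, so the ppserver.py processes on the remote machines would presumably need the same value. For the log, combining the two flags already mentioned in this thread, something like `ppserver.py -w 12 -d` on each node should show whether jobs actually arrive there.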