Parallel Python
Topic: how to prevent job rescheduling?
KSJ
« on: May 08, 2012, 10:50:43 PM »

Hi all,
I am using pp for a calculation and noticed something strange, so I have a few questions.

Measured with time.time(), the whole process took about 168.969000101 seconds.

The printed job statistics:

Job execution statistics:
 job count | % of all jobs | job time sum | time per job | job server
         3 |          4.11 |       0.0000 |     0.000000 | 192.168.1.101:59999
        43 |         58.90 |       0.0000 |     0.000000 | 140.112.63.243:60000
        14 |         19.18 |       0.0000 |     0.000000 | 140.112.63.132:59996
         6 |          8.22 |     236.5780 |    39.429667 | local
         7 |          9.59 |       0.0000 |     0.000000 | 192.168.1.168:59997
Time elapsed since server creation 168.969000101
WARNING: statistics provided above is not accurate due to job rescheduling
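
As a quick sanity check on the table above, the following sketch recomputes the "% of all jobs" column and the local "time per job" value from the printed counts. The variable names are illustrative only and are not part of pp.

```python
# Job counts copied from the statistics table above, keyed by job server.
job_counts = {
    "192.168.1.101:59999": 3,
    "140.112.63.243:60000": 43,
    "140.112.63.132:59996": 14,
    "local": 6,
    "192.168.1.168:59997": 7,
}
local_time_sum = 236.5780  # "job time sum" reported for the local server

total_jobs = sum(job_counts.values())

# Share of all jobs handled by each server, rounded like the table.
percentages = {
    server: round(100.0 * count / total_jobs, 2)
    for server, count in job_counts.items()
}

# Average duration of one local job.
local_time_per_job = local_time_sum / job_counts["local"]

for server, count in job_counts.items():
    print("%3d jobs | %5.2f %% | %s" % (count, percentages[server], server))
print("local time per job: %.6f s" % local_time_per_job)
```

The recomputed values agree with the printed table, so the counts and the local timing are at least internally consistent, even though the remote servers report a job time sum of 0.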

When does pp reschedule a job?
1. When a job is sent to workerA and it does not respond, or is still calculating, after a certain number of seconds?
  (Does that depend on TRANSPORT_SOCKET_TIMEOUT?)
  (I tried changing TRANSPORT_SOCKET_TIMEOUT to 3600, but it made no difference.)
2. When pp finds that workerA is faster than workerB, so it sends the same job to workerA and cancels it on workerB?
  (The speeds of these computers are very different.)

Why is the local job time sum (236) greater than the elapsed time (168)? Is it because of rescheduling?

I tried dividing the work into smaller pieces, and then the job statistics worked well
  (but that may spend more time on communication, right?).
So what settings can I use to run large tasks and still get accurate job statistics?

Thanks
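
The granularity trade-off described above can be sketched as follows. This is only an illustration of splitting one large task into a chosen number of pieces; `split_range` is a hypothetical helper, not a pp function, and the range 0..1000 is made up.

```python
def split_range(start, stop, pieces):
    """Split the half-open range [start, stop) into `pieces` nearly equal parts."""
    base, extra = divmod(stop - start, pieces)
    bounds = []
    lo = start
    for i in range(pieces):
        # The first `extra` pieces get one additional item each.
        hi = lo + base + (1 if i < extra else 0)
        bounds.append((lo, hi))
        lo = hi
    return bounds


# Few large pieces: less communication, but slow workers hold big jobs longer.
coarse = split_range(0, 1000, 5)
# Many small pieces: more communication, but work spreads more evenly.
fine = split_range(0, 1000, 73)

print(coarse)
print(len(fine), fine[0], fine[-1])
```

Each (lo, hi) pair would be the argument of one submitted job. Fewer pieces mean less communication overhead, while more pieces let fast and slow machines finish at closer to the same time.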
Vitalii
« Reply #1 on: June 02, 2012, 12:22:32 AM »

Quote: "Why is the local job time sum 236 > 168? Because of rescheduling?"
That is because the jobs run in parallel.
To debug this further, please run ppserver.py with the -d flag.
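
To illustrate the point about parallel execution: when several jobs run at the same time, the sum of their individual durations can exceed the wall-clock time. In the thread's numbers, 236.578 / 168.969 is about 1.4, which suggests that on average more than one local job was running at once. The sketch below is a minimal stand-in that uses Python's standard concurrent.futures rather than pp itself; `timed_job` and the sleep durations are assumptions made for the demonstration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_job(seconds):
    """Hypothetical job: sleep for `seconds` and return its own duration."""
    start = time.time()
    time.sleep(seconds)
    return time.time() - start


wall_start = time.time()
# Four workers run four jobs at the same time, like several local workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    durations = list(pool.map(timed_job, [0.2] * 4))
elapsed = time.time() - wall_start

# Analogous to the "job time sum" column: per-job durations added together.
job_time_sum = sum(durations)
print("job time sum: %.2f s, elapsed: %.2f s" % (job_time_sum, elapsed))
```

Here the job time sum comes out near four times the elapsed time, because each job's duration is counted in full even though the jobs overlap.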