Bug #5332

cr1k_d doesn't reliably connect to logger

Added by Derik Barseghian over 8 years ago. Updated over 8 years ago.

Status: Resolved
Priority: Normal
Category: sensor-view
Target version:
Start date: 03/01/2011
Due date:
% Done: 0%
Estimated time:
Bugzilla-Id: 5332

Description

Fairly often cr1k_d fails to connect to the logger. I get output like:

CR1K sample rate should be 30 second(s)
> <MAIN> Open Serial Port: /dev/ttyUSB0 Succefully!
> <MAIN> Connecting to datalogger...
[ERR-recvFrame] timeout (get data 0 bytes)
> [ERR] cmdRING
> Link error
> <MAIN> [ERR] fail to connect datalogger, please check cable
> <MAIN> [ERR] Driver initialization failed
> Serial port /dev/ttyUSB0 closed
> cr1k_d exit

or:

CR1K sample rate should be 30 second(s)
> <MAIN> Open Serial Port: /dev/ttyUSB0 Succefully!
> <MAIN> Connecting to datalogger...
> [ERR] Link State error after cmdRING (09)
> <MAIN> [ERR] fail to connect datalogger, please check cable
> <MAIN> [ERR] Driver initialization failed
> Serial port /dev/ttyUSB0 closed
-> cr1k_d exit

I thought dcd_mgr might be interfering, but that doesn't seem to be the case: in span.sh, which launches both processes, I added a long sleep between them, and cr1k_d still fails periodically.

I'll add this to the list of questions/requests for the ISI folks.
If we're unable to find a better solution, an alternative is to detect the failure and retry N times.
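
For reference, the detect-and-retry idea could be sketched roughly as follows. This is only an illustration under assumptions: the ticket does not say how or where the retry was eventually implemented, and retry_connect, connect_fn, and flaky_connect are hypothetical names, not part of cr1k_d. The flaky_connect stub stands in for the real datalogger connection so the sketch can be exercised without hardware.

```c
#include <stdio.h>
#include <unistd.h>  /* sleep() (POSIX) */

/* Hypothetical callback type: returns 0 on success, nonzero on failure. */
typedef int (*connect_fn)(void *ctx);

/*
 * Call try_connect up to max_attempts times, pausing delay_seconds
 * between failed attempts. Returns 0 as soon as one attempt succeeds,
 * or -1 if every attempt fails.
 */
int retry_connect(connect_fn try_connect, void *ctx,
                  int max_attempts, unsigned int delay_seconds)
{
    for (int attempt = 1; attempt <= max_attempts; attempt++) {
        if (try_connect(ctx) == 0) {
            return 0;
        }
        fprintf(stderr, "connect attempt %d of %d failed\n",
                attempt, max_attempts);
        /* Do not sleep after the final attempt. */
        if (attempt < max_attempts && delay_seconds > 0) {
            sleep(delay_seconds);
        }
    }
    return -1;
}

/* Stand-in for the real connection (assumption, for illustration only):
 * fails a fixed number of times, then succeeds. */
struct flaky_link {
    int failures_left;
    int calls;
};

int flaky_connect(void *ctx)
{
    struct flaky_link *link = ctx;
    link->calls++;
    if (link->failures_left > 0) {
        link->failures_left--;
        return -1;
    }
    return 0;
}
```

With a link that fails twice, retry_connect(flaky_connect, &link, 5, 0) returns 0 on the third call. In a real driver the callback would wrap whatever performs the ring/connect step, and a nonzero delay would give the serial link time to settle between attempts.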


Related issues

Blocks Kepler - Bug #5341: Belkin f5u409 usb=>serial dongle periodically stops working (New, 03/04/2011)

History

#1 Updated by Derik Barseghian over 8 years ago

Another failure example:

> <MAIN> Open Serial Port: /dev/ttyUSB0 Succefully!
> <MAIN> Connecting to datalogger...
> <MAIN> dump default cr file...
> cr file: /home/ubuntu/span.dan_mod/span-fep/conf/2sq110s.CR8
> scan rate = 30 second(s)
> [ERR] Response Code is not "complete(0x00)" (0E)
> [ERR] cmdFileDownload
> <MAIN> [ERR] fail to dump default config file
> <MAIN> [ERR] Driver initialization failed
> Serial port /dev/ttyUSB0 closed

#2 Updated by Derik Barseghian over 8 years ago

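Dan and I tracked the
[ERR] Response Code is not "complete(0x00)" (0E)
error down to a path parsing problem in pakbus_cmd.c: the code can't handle a path with a period in a directory name. It treats everything from the first period onward as the file extension, so for the config path under span.dan_mod it presumably picks up ".dan_mod/span-fep/conf/2sq110s.CR8" instead of ".CR8":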

------
/* find extension and create dump file name*/
ext = strstr(DCD_CONF+1, ".");

dump_file = (char*) malloc( strlen(ext) + strlen("CPU:conf") + 1 );
-----
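
One way to make the extension lookup robust to periods in directory names is to search from the end of the last path component. The sketch below is an assumption about what such a fix could look like, not the actual change made in pakbus_cmd.c; make_dump_name is a hypothetical helper, and the "CPU:conf" plus extension naming is inferred from the malloc size in the snippet above.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: build "CPU:conf" followed by the extension of the
 * last component of conf_path (e.g. "CPU:conf.CR8").
 * Returns a malloc'd string the caller must free, or NULL on allocation
 * failure.
 */
char *make_dump_name(const char *conf_path)
{
    const char *prefix = "CPU:conf";

    /* Look only at the file name, so periods in directories are ignored. */
    const char *base = strrchr(conf_path, '/');
    base = (base != NULL) ? base + 1 : conf_path;

    /* Take the last period in the file name; no period means no extension. */
    const char *ext = strrchr(base, '.');
    if (ext == NULL) {
        ext = "";
    }

    char *name = malloc(strlen(prefix) + strlen(ext) + 1);
    if (name == NULL) {
        return NULL;
    }
    strcpy(name, prefix);
    strcat(name, ext);
    return name;
}
```

For the path from the failure log, /home/ubuntu/span.dan_mod/span-fep/conf/2sq110s.CR8, this yields "CPU:conf.CR8", whereas searching for the first period would start the extension at ".dan_mod".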

It's unclear if the RING error is due to a separate issue.

#3 Updated by Derik Barseghian over 8 years ago

Dan fixed the path parsing bug.
Dan and I both still get the RING errors though, so leaving this bug open.

#4 Updated by Derik Barseghian over 8 years ago

Retry logic is working well, closing.

#5 Updated by Redmine Admin over 6 years ago

Original Bugzilla ID was 5332
