Saturday, 8 March 2008

Welcome to the SocSciBot4 Blog

Please post any bugs or problems you have with SocSciBot4 here. This will help other users to know about them, and I will eventually pick them up and fix them. Please also feel free to respond to other users' problems if you know of a solution.

16 comments:

Anonymous said...

Dear Mr Thelwall,
Wow, perfect, you created an SSB blog!
This project rocks!
But I still don't really understand the crawling algorithm. When I try your "small test" from the tutorial, I get only 3 domains connected with each other if I look only at the dom->dom connections. But, for example, the links listed under "The above 2043 links are from: cybermetrics.wlv.ac.uk" are not crawled and included?! When does the crawler decide not to crawl a followed link? My link limit is still 15,000. Thanks in advance! Sincerely yours,

Martin

Mike Thelwall said...

Hi Martin,
Thanks for your comment. When you switch to the dom-dom model, you will only see links between the three domains crawled (linkanalysis.wlv.ac.uk, cybermetrics.wlv.ac.uk and socscibot.wlv.ac.uk), with all the pages within each site collapsed to the hosting domain. If you want to see links to domains other than these 3 then change the Link Type Options to include links to sites not crawled, and the other links will disappear.
This is slightly strange behaviour! The reason is that SocSciBot was originally designed to analyse the links between a set of web sites, ignoring all web sites not crawled.
Best wishes,
Mike
PS When crawling linkanalysis.wlv.ac.uk, SocSciBot will crawl all URLs that it can find starting with linkanalysis.wlv.ac.uk but it won't go "off site" to crawl any pages on other sites, even if there are links in pages in linkanalysis.wlv.ac.uk that point to off-site pages.
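The behaviour described above can be illustrated with a minimal Python sketch. This is not SocSciBot's actual code: the function names (`host`, `in_scope`, `domain_links`), the `to_uncrawled` flag and any sample URLs are assumptions made up for illustration. It shows a crawl staying on one host, page-level links being collapsed to their hosting domains, and the switch between "links between crawled sites" and "links to sites not crawled".

```python
from urllib.parse import urlsplit

# The three sites crawled in the tutorial's small test.
CRAWLED = {
    "linkanalysis.wlv.ac.uk",
    "cybermetrics.wlv.ac.uk",
    "socscibot.wlv.ac.uk",
}


def host(url):
    """Return the host name of a URL, or "" if it has none."""
    return urlsplit(url).hostname or ""


def in_scope(url, site):
    """A crawl of `site` only fetches URLs on that same host (assumed rule)."""
    return host(url) == site


def domain_links(page_links, crawled, to_uncrawled=False):
    """Collapse page->page links into a set of domain->domain links.

    By default only links between crawled domains are kept. With
    to_uncrawled=True only links pointing to domains that were not
    crawled are kept instead, so the other links disappear.
    """
    result = set()
    for source, target in page_links:
        s, t = host(source), host(target)
        if s not in crawled or not t or s == t:
            continue  # skip unknown sources, bad targets and site self-links
        if (t in crawled) != to_uncrawled:
            result.add((s, t))
    return result
```

Under these assumptions, a link from a page on linkanalysis.wlv.ac.uk to a page on cybermetrics.wlv.ac.uk shows up as a single dom->dom link in the default view, while a link to any site outside the three crawled ones only appears once `to_uncrawled=True` is set.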

Joan Girones said...

Dear Mr. Thelwall

I have data collected with SocSciBot3.

Is there any problem to work with them if I install SocSciBot4?

Thanks a lot.

Mike Thelwall said...

Hi Joan,
There should be no problem with using SocSciBot 4 on crawls from SocSciBot 3 because they are designed to use the same data and results formats. But SocSciBot 4 is not fully tested yet so please back up your data just in case.
Best wishes,
Mike

Joan Girones said...

Thanks Mr. Thelwall,

Just a couple of additional prosaic questions:

- Is it possible to use SocSciBot4 without uninstalling SocSciBot3? Furthermore, as SocSciBot4 is a pre-release version, can both versions be used alternately, to detect possible bugs in version 4?

- The SocSciBot4 installation file “incorporates SocSciBot Tools and Cyclist”. Must the previous installations of “SocSciBot Tools” and “Cyclist” be uninstalled before installing SocSciBot4?

Finally, a SocSciBot-related novelty: Networks / Pajek:
http://vlado.fmf.uni-lj.si/pub/networks/pajek/
(referenced in “Link Analysis: An Information Science Approach”, p. 172) is now a wiki:
http://pajek.imfm.si/doku.php

Thanks.

Mike Thelwall said...

Hi Joan,
Thanks for your information about Pajek! Maybe I will create a Wiki for SocSciBot too one day.
You don't need to uninstall any of the previous software before installing and using SocSciBot 4. You should also be able to use them interchangeably - for example collect with SocSciBot 4 and process with SocSciBot3 Tools. Or process with SocSciBot 4 and then view the results in SocSciBot 3 Tools. They all use the same data and results formats and should give the same results except in unusual situations (strange URLs).
If you spot any SocSciBot 4 bugs, please let me know!
Best wishes,
Mike

Joan Girones said...

Dear Mike,

I’m still using MS Office 2003. It seems that SocSciBot4 Tools only works with a later version because it doesn’t identify Excel 2003 as Excel. Even the Excel logo seems to be from the 2007 version.

Is there some way to use Excel 2003 with SocSciBot4 Tools?

Thanks.

Mike Thelwall said...

Hi Joan,
Thanks for your comment. SocSciBot 4 should work with all versions of Excel but the problem might be that it can't find your copy of Excel because it is in a folder named in Spanish or Catalan and SocSciBot is not very clever at coping with different languages. The way around this is to start SocSciBot 4, go to the tools interface and click on the File menu and the "Pajek and Excel locations" option. In the dialog box that appears you will have to find Excel on your own computer and enter the location of the program. Once you have done this and saved the changes, it should always work afterwards.
Hope this helps!
Best wishes,
Mike

Joan Girones said...

Hello Mike,

In fact, I had written the Excel path into the SocSciBot.ini file of SocSciBot4.

Just opening "File" > "Pajek and Excel program locations", without changing anything, was enough: Excel now works fine.

Thanks,
Joan

Madhuri Aj said...

Hello Dr. Thelwall, how are you?
This is Madhu. I have gone through SSB but I am not able to get the ADM count summary; it gives 0's for the numbers. I do not understand what mistake I made while loading. I am getting the network diagram with Pajek, but not the ADM counts.

I hope I will get a positive answer.

regards
madhu(India)

Mike Thelwall said...

Hello Madhu,
Thanks for your email. If the ADM counts are zero then this suggests that SocSciBot has not found links between your collection of web sites. Is it possible that they don't link to each other, or that SocSciBot has missed links that do exist between them?
If they share a domain name - such as if they are all part of the same university - then this can also cause a problem.
Best wishes,
Mike
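To illustrate how a shared domain name could lead to zero counts, here is a minimal Python sketch. It is not how SocSciBot computes its ADM counts: the helper names (`unit`, `inter_unit_counts`), the `labels` parameter and the idea of aggregating on the last few labels of the host name are assumptions for illustration only. The point is that when every page is collapsed to the same unit, every link becomes a self-link and the between-unit counts are all zero.

```python
from collections import Counter
from urllib.parse import urlsplit


def unit(url, labels=None):
    """Return the aggregation unit for a URL.

    labels=None keeps the full host name; labels=3 keeps only the last
    three dot-separated labels (a crude, hypothetical way to collapse
    all hosts of one university to a single unit).
    """
    host = urlsplit(url).hostname or ""
    if labels is None:
        return host
    return ".".join(host.split(".")[-labels:])


def inter_unit_counts(page_links, labels=None):
    """Count links between different units, ignoring self-links."""
    counts = Counter()
    for source, target in page_links:
        s, t = unit(source, labels), unit(target, labels)
        if s != t:
            counts[(s, t)] += 1
    return counts
```

With a link between two hypothetical hosts under the same university domain, `inter_unit_counts(links)` counts it when hosts are kept separate, but `inter_unit_counts(links, labels=3)` returns an empty counter because both ends collapse to the same unit.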

Madhuri Aj said...

Thank you very much, Dr. Thelwall.
I will try to rectify the problem
according to your suggestions.
regards
madhu(India)

Florian Aubke said...

Dear Mike,
I am struggling with the crawl output. I ran the tutorials to understand how SocSciBot works. Then I crawled my university's website as a test. I am interested in the internal site links, i.e. the structure of the website. The crawl took a long time, but worked (I can see the raw data). However, the ADM count is zero and there are no links shown in the network (I chose the site self-links and file/page as aggregation level). When comparing the raw data from the tutorial crawls with my data, I noticed that my data includes a leading dot (.modul.ac.at), whereas the tutorial data does not.

I am not sure what I am doing wrong.
Cheers
Florian

Mike Thelwall said...

Dear Florian,
I am sorry about this problem and don't know what the cause is. Please could you put a zip version of the folder for your crawl on the web somewhere for me to download and I will have a look?
Best wishes,
Mike

Florian Aubke said...

Dear Mike,
thank you. You can access the files under http://survey.modul.ac.at/MUScoSciBotData/
I could get the raw data for the directory level I needed, but the report problem remained.
Thanks
Florian

Mike Thelwall said...

Dear Florian,
I am sorry but please could you zip the folders and send me a link to the zip version? I would like to unzip the files and work with them on my development computer to see what the problem is.
Best wishes,
Mike