Distributed (grid) file search, duplicate detection, and full-text indexing over HTTP/TCP connections. It reaches high speed through multi-threading and parallel computing, automatically detects physical hard disks, accelerates searches with the NTFS USN Journal, and uses customized serialization, compression, and caching.

Many more features are coming soon. This is a serious project under rapid development, with roughly one major feature added every one or two days.

Current Features:
  • Distributed: search across different computers over the Internet via HTTP or TCP.
  • Safe Access: built-in role-based access control (RBAC).
  • Fast: multi-threaded, concurrent grid computing.
  • Efficient: compressed hashes and full-text index with local serialized storage and compressed network transfer.
  • Intelligent:
    • detects different physical hard disks (not partitions) and automatically accelerates searches with parallel computing.
    • detects NTFS volumes and automatically uses the USN Journal to make file search about 10x faster.
  • Extensible: supports different full-text index interfaces.
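
As a rough illustration of how hashing and parallel computing can combine to find duplicate files, here is a minimal sketch that uses only the .NET standard library. It shows the general technique under stated assumptions and is not this project's implementation: the `DuplicateSketch` class and `FindDuplicates` method are hypothetical names, SHA-256 is an arbitrary hash choice, and the sketch does not use the USN Journal or any per-disk scheduling.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

// Hypothetical helper for illustration only; not part of this library.
public static class DuplicateSketch
{
    // Returns groups of file paths whose contents share the same SHA-256 hash.
    public static List<List<string>> FindDuplicates(IEnumerable<string> folders)
    {
        // Step 1: group files by size. A file with a unique size cannot have a duplicate,
        // so only groups with more than one member need to be hashed.
        var candidates = folders
            .SelectMany(f => Directory.EnumerateFiles(f, "*", SearchOption.AllDirectories))
            .GroupBy(p => new FileInfo(p).Length)
            .Where(g => g.Count() > 1)
            .SelectMany(g => g)
            .ToList();

        // Step 2: hash the remaining candidates in parallel (PLINQ) and group by hash.
        return candidates
            .AsParallel()
            .Select(p => new { Path = p, Hash = HashFile(p) })
            .GroupBy(x => x.Hash)
            .Where(g => g.Count() > 1)
            .Select(g => g.Select(x => x.Path).OrderBy(p => p, StringComparer.Ordinal).ToList())
            .ToList();
    }

    // Computes the SHA-256 hash of a file and returns it as a hex string.
    static string HashFile(string path)
    {
        using (var sha = SHA256.Create())
        using (var stream = File.OpenRead(path))
        {
            return BitConverter.ToString(sha.ComputeHash(stream));
        }
    }
}
```

Filtering by size first keeps the expensive step (reading and hashing file contents) limited to files that could actually match.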


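Fast, efficient, and intelligent distributed file management with compressed full-text indexing and network transfer, built on parallel computing.

Highlights
  • Distributed: search any number of computers over the Internet via TCP or HTTP.
  • Secure access: role-based access control (RBAC), including definable remote-access accounts and permitted directories.
  • Fast: takes full advantage of multi-core CPUs with automatic parallel computing.
  • Intelligent:
    • automatically identifies different physical disks on the same machine and accelerates accordingly.
    • automatically uses NTFS features for fast file retrieval, more than 10 times faster than an ordinary search.
  • Efficient: fast storage and retrieval of hash values and full-text indexes, with compressed network transfer.
  • Extensible content search: content matching and full-text indexing, with interface support.

File synchronization support will be added soon.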



How To Use

Local search for identical files:

string duplicateFolder = @"d:\backup";
Dictionary<string, MatchFileItem> result;
BaseWork worker = new WorkV6();
// Search both folders by size and name, matching files with identical content.
result = worker.Find(new FileURI[] { new FileURI(@"c:\download\"), new FileURI(duplicateFolder) },
    new string[] { }, "", SearchTypes.Size | SearchTypes.Name, MatchTypes.ContentSame);
// List the matches found for duplicateFolder.
List<string> duplicatedFiles = worker.FindAll(result, duplicateFolder);

Distributed search for files containing a string (full text):

// First, initialize some settings.
WorkUtils.Initialiaze(new KeyValuePair<string, string>(), CompressionMethods.GZip,
    new System.Net.WebProxy(), true, OnStorageBeforeFileProcess);
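
The `CompressionMethods.GZip` argument above presumably selects gzip for compressed storage and transfer. As a rough illustration of what gzip does to a payload, here is a minimal round-trip sketch built on the standard `System.IO.Compression.GZipStream` class. The `GZipSketch` class and its method names are hypothetical and not part of this library; how the library applies compression internally is not shown on this page.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper for illustration only; not part of this library.
public static class GZipSketch
{
    // Compresses a byte array with gzip.
    public static byte[] Compress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // Disposing the GZipStream flushes the final compressed bytes into 'output'.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            }
            // MemoryStream.ToArray still works after the stream has been closed.
            return output.ToArray();
        }
    }

    // Restores the original bytes from gzip-compressed data.
    public static byte[] Decompress(byte[] data)
    {
        using (var input = new MemoryStream(data))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

Repetitive data such as file lists and index entries tends to compress well, which is why compressing before sending can reduce transfer time on slow links.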


  • Server
BaseManager manager = new WorkV6HTTPManager(); // or WorkV6TCPManager for TCP
// Define a role that may access one folder with the rights listed below.
Role role = new Role("user");
role.AddFilePath(@"c:\temp\New Folder");
role.AddUser(new UserAuth("user", "pass"));
role.AddRight(UserRights.Discover);
role.AddRight(UserRights.Search);
role.AddRight(UserRights.Delete);
manager.Roles.AddRole(role);
// Subscribe to progress and request events, then listen on port 8880.
manager.Progress += new EventHandler<ProgressEventArgs>(OnWorkProgress);
manager.Request += new EventHandler<DataTransferEventArgs>(OnManagerRequest);
manager.Start(8880);

  • Client
// Remote folder given as host:port/path, with the credentials defined on the server.
FileURI remoteFolder = new FileURI(@"202.2.3.4:8880/e:\temp\New Folder", "user", "pass", 0, ObjectTypes.RemoteURI);
BaseWork worker = new WorkV6();
Dictionary<string, MatchFileItem> result;
result = worker.Find(new FileURI[] { new FileURI(@"c:\download\"), remoteFolder },
    new string[] { }, "YOUR KEYWORD HERE", SearchTypes.Size, MatchTypes.ContentExtract);
List<string> matchFiles = worker.FindAll(result, string.Empty);



Workflow: [workflow diagram]

A detailed document can be found on CodeProject: http://www.codeproject.com/KB/IP/filio.aspx

Last edited Aug 3, 2010 at 7:04 AM by unruledboy, version 40