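The new 学考 (academic proficiency exam) results are about to be released, so I am updating the score query tool I wrote earlier. Over the past half year the site finally moved the captcha answer out of the Cookie and into the session. So let's take another look at this captcha: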
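The captcha is fairly simple: 4 digits, each in the range 0-8, in random colors. Fortunately the digit positions are fixed. There is some simple distortion, but judging by the border, it looks like the captcha is generated first and then distorted as a whole. Opening it in Photoshop shows that the background noise is mostly small gray speckles. Filtering this kind of noise is easy: just drop the gray pixels. Gray is characterized by small differences between the R, G and B components. To keep black from being wiped out as well, I also require a component to be above 128 (the code checks red). Looking further, there is also light-colored noise, such as pale purple-gray. That is filtered by summing the three RGB components and checking that the sum is above a threshold. The implementation: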
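    import java.awt.Color;

    // Gray speckles: the three channels are nearly equal. The red > 128 check
    // keeps dark (near-black) pixels from being treated as background.
    private static boolean isBackgroundColor(int colorInt) {
        Color color = new Color(colorInt);
        int inter = Math.abs(color.getRed() - color.getGreen())
                + Math.abs(color.getGreen() - color.getBlue())
                + Math.abs(color.getRed() - color.getBlue());
        return inter < 40 && color.getRed() > 128;
    }

    // Light noise (e.g. pale purple-gray): the channel sum is high.
    private static boolean isBackgroundColor2(int colorInt) {
        Color color = new Color(colorInt);
        return color.getRed() + color.getGreen() + color.getBlue() > 550;
    }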
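To get a feel for the two thresholds, here is a small sketch that runs the same two tests on a few hand-picked colors. The sample colors and the names `ColorFilterDemo`, `isGrayNoise`, `isLightNoise` and `isBackground` are made up for illustration; they are not measured from real captchas.

```java
import java.awt.Color;

class ColorFilterDemo {
    // Same test as isBackgroundColor: channels nearly equal and red above 128.
    static boolean isGrayNoise(int rgb) {
        Color c = new Color(rgb);
        int spread = Math.abs(c.getRed() - c.getGreen())
                + Math.abs(c.getGreen() - c.getBlue())
                + Math.abs(c.getRed() - c.getBlue());
        return spread < 40 && c.getRed() > 128;
    }

    // Same test as isBackgroundColor2: channel sum above 550.
    static boolean isLightNoise(int rgb) {
        Color c = new Color(rgb);
        return c.getRed() + c.getGreen() + c.getBlue() > 550;
    }

    static boolean isBackground(int rgb) {
        return isGrayNoise(rgb) || isLightNoise(rgb);
    }

    public static void main(String[] args) {
        Color[] samples = {
            new Color(160, 160, 160), // mid-gray speckle (illustrative)
            new Color(220, 200, 230), // pale purple-gray (illustrative)
            new Color(20, 20, 20),    // near-black stroke (illustrative)
            new Color(200, 30, 30),   // red stroke (illustrative)
        };
        for (Color c : samples) {
            String kind = isBackground(c.getRGB()) ? "background" : "digit";
            System.out.println(c + " -> " + kind);
        }
    }
}
```

With these samples, the first two colors are classified as background and the last two as digit strokes. The pale purple-gray fails the gray test (its channel spread is 60) and is only caught by the sum test, which is why both filters are needed.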
Then just binarize directly:
    import java.awt.Color;
    import java.awt.image.BufferedImage;

    // Binarizes the image in place: background pixels become white,
    // everything else becomes black. Returns the same image instance.
    public static BufferedImage binaryzation(BufferedImage image) throws Exception {
        int width = image.getWidth();
        int height = image.getHeight();
        for (int x = 0; x < width; ++x) {
            for (int y = 0; y < height; ++y) {
                int rgb = image.getRGB(x, y);
                if (isBackgroundColor(rgb) || isBackgroundColor2(rgb)) {
                    image.setRGB(x, y, Color.WHITE.getRGB());
                } else {
                    image.setRGB(x, y, Color.BLACK.getRGB());
                }
            }
        }
        return image;
    }
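Running it once, the result looks pretty good. Next, split the digits. Because each digit is stretched to a different degree, I still split the image into four positions and recognize each position separately. Splitting: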
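    import java.awt.image.BufferedImage;
    import java.util.ArrayList;
    import java.util.List;

    // Cuts the captcha into four fixed boxes, given as (x, y, width, height).
    // The image must be at least 80x40. Note that column x = 35 falls between
    // the second and third boxes and is not part of any segment.
    public static List<BufferedImage> splitImage(BufferedImage image) throws Exception {
        List<BufferedImage> digitImageList = new ArrayList<>();
        digitImageList.add(image.getSubimage(0, 0, 16, 40));
        digitImageList.add(image.getSubimage(16, 0, 19, 40));
        digitImageList.add(image.getSubimage(36, 0, 22, 40));
        digitImageList.add(image.getSubimage(58, 0, 22, 40));
        return digitImageList;
    }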
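The split yields four images, one per digit position. With that done, I can collect a sample of every digit for each position, stored as captcha/<position>/<digit>.png.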
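One way to collect the samples is to take a captcha whose digits are already known, binarize and split it, and save each piece under its position and digit. The sketch below assumes the `captcha/<position>/<digit>.png` layout that the loading code expects. The class `TemplateCollector`, its `save` method and the hand-typed `label` parameter are hypothetical helpers, not part of the original tool, and `main` uses blank stand-in images rather than real captcha pieces.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;
import javax.imageio.ImageIO;

class TemplateCollector {
    /**
     * Saves each already-split digit image as root/position/digit.png.
     * label is the known captcha text typed in by hand, e.g. "3078".
     */
    static void save(List<BufferedImage> digits, String label, File root) throws IOException {
        if (digits.size() != label.length()) {
            throw new IllegalArgumentException("label length must match the number of digit images");
        }
        for (int pos = 0; pos < digits.size(); pos++) {
            File dir = new File(root, String.valueOf(pos));
            if (!dir.isDirectory() && !dir.mkdirs()) {
                throw new IOException("cannot create " + dir);
            }
            // An existing file for the same position and digit is overwritten.
            File out = new File(dir, label.charAt(pos) + ".png");
            ImageIO.write(digits.get(pos), "png", out);
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in data: four blank images instead of real captcha segments.
        File root = Files.createTempDirectory("captcha").toFile();
        List<BufferedImage> blanks = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            blanks.add(new BufferedImage(16, 40, BufferedImage.TYPE_INT_RGB));
        }
        save(blanks, "3078", root);
        System.out.println("saved under " + root);
    }
}
```

Repeating this over enough labeled captchas until every digit 0-8 has been seen at every position fills all 36 template files.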
Then load them:
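    import java.awt.image.BufferedImage;
    import java.io.File;
    import java.util.ArrayList;
    import java.util.List;
    import javax.imageio.ImageIO;

    // model.get(position).get(digit) is the template for that digit at that
    // position (declaration inferred from how the field is used below).
    private static List<List<BufferedImage>> model;

    static {
        // Load the model
        try {
            model = new ArrayList<>();
            List<BufferedImage> list;
            for (int i = 0; i <= 3; i++) {
                list = new ArrayList<>();
                for (int ii = 0; ii <= 8; ii++) {
                    list.add(ImageIO.read(new File("captcha/" + i + "/" + ii + ".png")));
                }
                model.add(list);
            }
        } catch (Exception e) {
            System.out.println("Error occurred in reading captcha model: " + e + ", " + e.getLocalizedMessage());
        }
    }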
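Since the font has not changed either, I simply compare pixel by pixel, count the differing pixels, and pick the digit with the smallest count. Counting the differences: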
    // Counts the pixels that differ between two images.
    // img_b must be at least as large as img_a, otherwise getRGB throws.
    private static int diff(BufferedImage img_a, BufferedImage img_b) {
        int diff = 0;
        int width = img_a.getWidth();
        int height = img_a.getHeight();
        for (int x = 0; x < width; ++x) {
            for (int y = 0; y < height; ++y) {
                if (img_a.getRGB(x, y) != img_b.getRGB(x, y)) diff++;
            }
        }
        return diff;
    }
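As a toy illustration of the pixel count, the sketch below builds two 3x3 white images and blackens two pixels in one of them. It also adds an explicit size check, which the version above does not have. The class `DiffDemo` and the helper `blank` are invented for this example.

```java
import java.awt.Color;
import java.awt.image.BufferedImage;

class DiffDemo {
    // Pixel-difference count with an explicit size check added.
    static int diff(BufferedImage a, BufferedImage b) {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
            throw new IllegalArgumentException("images must have the same size");
        }
        int count = 0;
        for (int x = 0; x < a.getWidth(); x++) {
            for (int y = 0; y < a.getHeight(); y++) {
                if (a.getRGB(x, y) != b.getRGB(x, y)) count++;
            }
        }
        return count;
    }

    // Creates an all-white image of the given size.
    static BufferedImage blank(int width, int height) {
        BufferedImage img = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int x = 0; x < width; x++) {
            for (int y = 0; y < height; y++) {
                img.setRGB(x, y, Color.WHITE.getRGB());
            }
        }
        return img;
    }

    public static void main(String[] args) {
        BufferedImage a = blank(3, 3);
        BufferedImage b = blank(3, 3);
        b.setRGB(0, 0, Color.BLACK.getRGB());
        b.setRGB(2, 1, Color.BLACK.getRGB());
        System.out.println(diff(a, b)); // prints 2
    }
}
```

A template from a different position has a different width, so a check like this turns an accidental cross-position comparison into a clear error instead of a misleading count.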
Finally comes the matching itself. Together with binarization and splitting, it looks like this:
    // Recognizes a captcha: binarize, split into four positions, then for each
    // position pick the template digit with the fewest differing pixels.
    public static String read(BufferedImage image) throws Exception {
        Filtering.binaryzation(image);
        List<BufferedImage> imgs = Infer.splitImage(image);
        BufferedImage cur;
        StringBuilder result = new StringBuilder();
        int cur_diff, min_diff, min;
        for (int idx = 0; idx <= 3; idx++) {
            cur = imgs.get(idx);
            min_diff = Integer.MAX_VALUE; // start above any possible diff
            min = 0;
            for (int i = 0; i <= 8; i++) {
                cur_diff = diff(cur, model.get(idx).get(i));
                // System.out.println("Diff for image: "+idx+", "+i+", result: "+cur_diff);
                if (cur_diff < min_diff) {
                    min_diff = cur_diff;
                    min = i;
                }
            }
            result.append(min);
        }
        return result.toString();
    }
In testing, the recognition rate is basically 100%. Of course, that is mainly because the captcha is so simple.
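A recognition rate like this can be checked with a small harness along the following lines. This is only a sketch under assumptions: it expects a directory of captchas named by their correct answer (for example `3078.png`), which is a naming convention invented here, and the class `AccuracyCheck` and the stub recognizer in `main` are hypothetical.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.function.Function;
import javax.imageio.ImageIO;

class AccuracyCheck {
    /** Returns the fraction of label.png files in dir that the recognizer reads correctly. */
    static double accuracy(File dir, Function<BufferedImage, String> recognizer) throws IOException {
        File[] files = dir.listFiles((d, name) -> name.endsWith(".png"));
        if (files == null || files.length == 0) {
            throw new IOException("no .png samples in " + dir);
        }
        int correct = 0;
        for (File f : files) {
            String name = f.getName();
            String expected = name.substring(0, name.length() - ".png".length());
            BufferedImage img = ImageIO.read(f); // null if the file is not a readable image
            if (img != null && expected.equals(recognizer.apply(img))) {
                correct++;
            }
        }
        return (double) correct / files.length;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java AccuracyCheck <directory of label.png files>");
            return;
        }
        // Placeholder recognizer; swap in a call to the real read method.
        Function<BufferedImage, String> stub = img -> "0000";
        System.out.println("accuracy: " + accuracy(new File(args[0]), stub));
    }
}
```

Because `read` declares `throws Exception`, passing it as the recognizer needs a lambda that catches the exception and returns, say, an empty string, so that a failure counts as a miss.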